Human Proteins

Field of the Invention 
The present invention relates to genes encoding novel human proteins which
exhibit a variety of useful biological activities. More specifically, isolated nucleic acid
molecules are provided which encode polypeptides comprising various forms of
human proteins. Human polypeptides are also provided, as are vectors, host cells and
recombinant methods for producing the same. Also provided are methods for
detecting nucleic acids or polypeptides related to those of the invention, for example,
to aid in identification of a biological sample or diagnosis of disorders related to
expression of protein genes of this invention. The invention further relates to
methods for identifying agonists and antagonists of the proteins of the invention, as
well as to methods for treatment of disorders related to protein gene expression using
polypeptides, antagonists and agonists of the invention.

Background of the Invention 
Identification and sequencing of human genes is a major goal of modern
scientific research. For example, by identifying genes and determining their sequences,
scientists have been able to make large quantities of valuable human gene products.
These include human insulin, interferon, Factor VIII, human growth hormone, tissue
plasminogen activator, erythropoietin and numerous other proteins. Additionally,
knowledge of gene sequences can provide keys to diagnosis, treatment or cure of
genetic diseases such as muscular dystrophy and cystic fibrosis.

Despite the great progress that has been made in recent years, only a small 
number of genes which encode the presumably thousands of human proteins have 
been identified and sequenced. Therefore, there is a need for identification and
characterization of novel human proteins and corresponding genes which can play a
role in detecting, preventing, ameliorating or correcting disorders related to abnormal 
expression of and responses to such proteins. 

Summary of the Invention

The present invention provides isolated nucleic acid molecules comprising [...]
(see Example 2) by a reference number designated as a "Protein ID (Identifier)" (e.g.,
[...]). Each protein of the invention is related to a human complementary [...]
protein. The cDNA clone related to each protein of [...]
"cDNA Clone ID (Identifier)" in Table 1 (e.g., "HABCE99"). DNA of each cDNA
[...]
Culture Collection and given the ATCC Deposit Number shown for each cDNA
Clone ID in Table 1, as further described below.

The invention provides a nucleotide sequence determined for an mRNA [...]
X for the determined nucleotide sequence of each protein is an integer specified in
Table 1. The determined nucleotide sequence provided for each protein of the [...]
methods to DNA of the corresponding deposited cDNA clone cited in Table 1. [...]
the invention has been translated to provide a determined amino acid sequence for each
protein which is identified in Table 1 by a SEQ ID NO:Y, where the value of Y [...]
the protein and continuing until the first [...] termination ("stop") codon. Due [...]
to sequence the cDNA clones described herein, occasional nucleotide sequence errors
are expected in the determined nucleotide sequences of the invention. These errors 
may include insertions or deletions of one or a few nucleotides in the determined 
nucleotide sequence as compared to the actual nucleotide sequence of the deposited 
cDNA. As one of ordinary skill would appreciate, an incorrect insertion or deletion of
one or two nucleotides in a determined nucleotide sequence leads to a shift in the
translation reading frame compared to the reading frame actually encoded by a cDNA
clone. Further, such a shift in frame within an actual open reading frame frequently
leads to the appearance of a translation termination (stop) codon within the sequence
encoding the polypeptide. Accordingly, due to occasional errors in the nucleotide
sequences determined from the deposited cDNAs and any related DNA clones used to
prepare the determined sequence for the mRNA encoding each secreted protein of the 
invention, the translations shown as determined amino acid sequences in SEQ ID 
NO:Y may represent only a portion of the complete amino acid sequence of the 
human secreted protein actually encoded by the mRNA represented by the 
corresponding cDNA clone in the ATCC deposit identified in Table 1. In any event, 
the determined amino acid sequence for each protein in Table 1, which is shown in 
SEQ ID NO:Y for each protein, comprises at least a portion of the amino acid 
sequence determined for that protein. 

More particularly, the determined amino acid sequence is the amino acid 
sequence translated from the determined nucleotide sequence in the open reading frame 
of the first amino acid of the ORF to the last amino acid of that frame. In other 
words, the determined amino acid sequence is translated from the determined 
nucleotide sequence beginning at the codon having as its 5' end the nucleotide in the
position of SEQ ID NO:X identified in Table 1 as the 5' nucleotide of the first amino 
acid (abbreviated in Table 1 as "5' NT of First AA"). Translation of the determined 
nucleotide sequence is continued in the reading frame of that first amino acid codon to 
the first stop codon in that same open reading frame, i.e., to the position in SEQ ID 
NO:X which encodes the amino acid at the position in SEQ ID NO:Y identified as the
"last amino acid of the open reading frame" (abbreviated as "Last AA of ORF"). 

For any determined amino acid sequence in which the first amino acid is the
methionine encoded by the translation initiation codon for the protein, Table 1 also
identifies the position in SEQ ID NO:X of the 5' nucleotide of the start codon ("5'
NT of Start Codon") as the same position in SEQ ID NO:X as that of the 5'
nucleotide of the first amino acid ("First AA").
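
The translation rule described above, and the effect of a sequencing error on it, can be
sketched in a few lines of Python. This is only an illustration under stated assumptions:
the function name translate_orf and the toy sequence are hypothetical, positions are
1-based as in Table 1, and the standard genetic code is assumed.

```python
# Standard genetic code, indexed in the conventional T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate_orf(seq, first_aa_nt):
    """Translate seq from the 1-based position first_aa_nt to the first stop codon."""
    residues = []
    for i in range(first_aa_nt - 1, len(seq) - 2, 3):
        amino_acid = CODON_TABLE[seq[i:i + 3]]
        if amino_acid == "*":  # translation termination (stop) codon
            break
        residues.append(amino_acid)
    return "".join(residues)


# Hypothetical toy sequence: the start codon ATG has its 5' nucleotide at position 3.
determined = "GGATGGCTAAAGGTTGACC"
print(translate_orf(determined, 3))      # MAKG

# One erroneously inserted nucleotide shifts the frame and creates an early stop codon.
with_insertion = determined[:5] + "T" + determined[5:]
print(translate_orf(with_insertion, 3))  # MC
```

The second call shows why a determined amino acid sequence may represent only a portion
of the protein actually encoded by the deposited clone.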

Table 1 also identifies the positions in SEQ ID NO:Y of the last amino acid of
the signal peptide ("Last AA of Sig Pep") and the first amino acid of the secreted
portion ("First AA of Secreted Portion") of the protein, for those polypeptides having
a secretory leader sequence. The "secreted portion" of a secreted protein in the
present context indicates that portion of the complete polypeptide translated from an
mRNA which remains after cleavage of the signal peptide by a signal peptidase. In
this context the term "mature" may also be used interchangeably with "secreted
portion" although it is recognized that in other contexts "mature" may designate a
portion of a "proprotein" which is produced by further cleavage of the polypeptide
after cleavage of the signal peptide.

Accordingly, in one aspect the invention provides an isolated nucleic acid
molecule comprising a nucleotide sequence which is identical to the nucleotide
sequence of SEQ ID NO:X, where X is any integer as defined in Table 1. The
invention also provides an isolated nucleic acid molecule comprising a nucleotide
sequence which is identical to a portion of the nucleotide sequence of SEQ ID NO:X,
for instance, a sequence of at least 50, 100 or 150 contiguous nucleotides in the
nucleotide sequence of SEQ ID NO:X. Such a portion of the nucleotide sequence of
SEQ ID NO:X may be described most generally as a sequence of at least C contiguous
nucleotides in the nucleotide sequence of SEQ ID NO:X where: (1) the sequence of at
least C contiguous nucleotides begins with the nucleotide at position N of SEQ ID
NO:X and ends with the nucleotide at position M of SEQ ID NO:X; (2) C is any
integer in the range beginning with a convenient primer size, for instance, about 20, to
the total nucleotide sequence length ("Total NT Seq.") as set forth for SEQ ID NO:X
in Table 1; (3) N is any integer in the range of 1 to the first position of the last C
nucleotides in SEQ ID NO:X, or more particularly, N is at most the value of Total
NT Seq. minus C, plus 1 (i.e., Total NT Seq. - (C - 1)); and (4) M is any
integer in the range of C to Total NT Seq.
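
The relationship among C, N and M can be made concrete with a short sketch. The
function names and the six-nucleotide sequence below are hypothetical, and the sketch
enumerates fragments of exactly C nucleotides; longer fragments follow by calling it
with a larger value of C.

```python
def max_start_position(total_nt_seq, c):
    """Largest allowed N: the first position of the last c nucleotides (1-based)."""
    if not 1 <= c <= total_nt_seq:
        raise ValueError("c must lie between 1 and the total sequence length")
    return total_nt_seq - (c - 1)


def fragments_of_length(seq, c):
    """Yield (N, M, fragment) for every fragment of exactly c contiguous nucleotides."""
    for n in range(1, max_start_position(len(seq), c) + 1):
        m = n + c - 1  # 1-based, inclusive end position
        yield n, m, seq[n - 1:m]


# Hypothetical 6-nucleotide sequence, far shorter than any real SEQ ID NO:X.
for n, m, fragment in fragments_of_length("ACGTAC", 4):
    print(n, m, fragment)
```

With a total length of 6 and C equal to 4, N runs from 1 to 3 and M from 4 to 6, which
matches ranges (3) and (4) above.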

Preferably, the sequence of contiguous nucleotides in the nucleotide sequence
of SEQ ID NO:X is included in SEQ ID NO:X in the range of positions beginning
with the nucleotide at about the 5' nucleotide of the clone sequence ("5' NT of Clone
Seq." in Table 1) and ending with the nucleotide at about the 3' nucleotide of the clone
sequence ("3' NT of Clone Seq." in Table 1). More preferably, the sequence of
contiguous nucleotides is in the range of positions beginning with the nucleotide at
about the position of the 5' Nucleotide of the Start Codon ("5' NT of Start Codon"
in Table 1) and ending with the nucleotide at about the position of the 3' Nucleotide
of the Clone Sequence as set forth for SEQ ID NO:X in Table 1. For instance, one
preferred embodiment of this aspect of the invention is an isolated nucleic acid
molecule which comprises a sequence at least 95%, 96%, 97%, 98%, or 99% identical
to a sequence of about 500 contiguous nucleotides included in the nucleotide sequence
of SEQ ID NO:X beginning at about the 5' NT of Start Codon position as set forth
for SEQ ID NO:X in Table 1. Another preferred embodiment of this aspect of the
invention is a nucleic acid molecule comprising a nucleotide sequence which is at least
95% identical to the nucleotide sequence of SEQ ID NO:X beginning with the
nucleotide at about the position of the 5' Nucleotide of the First Amino Acid of the
Signal Peptide and ending with the nucleotide at about the position of the 3'
Nucleotide of the Clone Sequence as defined for SEQ ID NO:X in Table 1.

Further embodiments of the invention include isolated nucleic acid molecules
which comprise a nucleotide sequence at least 90% identical, and more preferably at
least 95%, 96%, 97%, 98%, 99% or 99.9% identical, to any of the determined
nucleotide sequences above. For instance, one such embodiment is an isolated nucleic
acid molecule comprising a nucleotide sequence which is at least 95% identical to a
sequence of at least 50 contiguous nucleotides in the nucleotide sequence of SEQ ID
NO:X wherein X is any integer as defined in Table 1. Another embodiment of this
aspect of the invention is an isolated nucleic acid molecule comprising a nucleotide
sequence which is at least 95% identical to the complete nucleotide sequence of SEQ
ID NO:X.
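
As a rough illustration of a percent identity threshold, the following sketch compares a
query against every same-length window of a reference sequence. It assumes an ungapped,
position-by-position comparison, which is a simplification chosen here for brevity and
not necessarily the measure of identity intended in this document; the function names
and sequences are hypothetical.

```python
def percent_identity(a, b):
    """Percentage of positions at which two equal-length sequences match."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def best_window_identity(query, reference):
    """Highest ungapped identity of query against any same-length window of reference."""
    width = len(query)
    if width == 0 or width > len(reference):
        raise ValueError("query must be non-empty and no longer than reference")
    return max(
        percent_identity(query, reference[i:i + width])
        for i in range(len(reference) - width + 1)
    )


# Hypothetical sequences: one mismatch in ten positions gives 90% identity.
print(percent_identity("ACGTACGTAC", "ACGTACGTAA"))
print(best_window_identity("CGTA", "AACGTAGG"))
```

A query would satisfy an "at least 95% identical" criterion under this simplified measure
when the returned value is 95.0 or greater.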

Isolated nucleic acid molecules which hybridize under stringent hybridization 
conditions to a nucleic acid molecule described above also are provided. Such a nucleic 
acid molecule which hybridizes does not hybridize under stringent hybridization 
conditions to a nucleic acid molecule having a nucleotide sequence consisting of only A 
residues or of only T residues. 

The invention further provides a composition of matter comprising a nucleic
acid molecule which comprises a human cDNA clone identified by a cDNA Clone ID
(Identifier) in Table 1, which DNA molecule is contained in the material deposited
with the American Type Culture Collection and given the ATCC Deposit Number
shown in Table 1 for that cDNA clone. As described further in Example 1, this
deposited material comprises a mixture of plasmid DNA molecules containing cloned
cDNAs of the invention. Further, the invention provides an isolated nucleic acid
molecule comprising a nucleotide sequence which is, for instance, at least 95%
identical to a sequence of at least 50, 150 or 500 contiguous nucleotides in the
nucleotide sequence encoded by a human cDNA clone contained in the deposit given
the ATCC Deposit Number shown in Table 1. One preferred embodiment of this
aspect is an isolated nucleic acid molecule comprising a nucleotide sequence which is
at least 95% identical to the complete nucleotide sequence encoded by a human cDNA
clone identified in Table 1 and as contained in the deposit with the ATCC Deposit
Number shown in Table 1. Also provided are isolated nucleic acid molecules which
hybridize under stringent hybridization conditions to a nucleic acid molecule
comprising a nucleotide sequence encoded by a human cDNA clone identified in Table
1 and contained in the cited deposit.
These nucleic acid molecules of the invention may be used for a variety of
identification and diagnostic purposes. For instance, the invention provides a method
for detecting in a biological sample a nucleic acid molecule comprising a nucleotide
sequence which is at least 95% identical to a sequence of at least 50 contiguous
nucleotides in a nucleotide sequence of the invention. The sequence of the nucleic acid
molecule used in this method is selected from the group consisting of: a nucleotide
sequence of SEQ ID NO:X wherein X is any integer as defined in Table 1; and a
nucleotide sequence encoded by a human cDNA clone identified by a cDNA Clone
Identifier in Table 1 and contained in the deposit with the ATCC Deposit Number
shown for said cDNA clone in Table 1. This method of the invention comprises a
step of comparing a nucleotide sequence of at least one nucleic acid molecule in the
biological sample with a sequence selected from the group above, and determining
whether the sequence of the nucleic acid molecule in the sample is at least 95%
identical to the selected sequence. The step of comparing sequences may comprise
determining the extent of nucleic acid hybridization between nucleic acid molecules in
the sample and a nucleic acid molecule comprising the sequence selected from the
above group. Alternatively, this step may be performed by comparing the nucleotide
sequence determined from a nucleic acid molecule in the sample, for instance by
automated DNA sequence methods, with the sequence selected from the above group.

In another aspect, the invention provides methods for identifying the species,
tissue or cell type of a biological sample based on detecting nucleic acid molecules in
the sample which comprise a nucleotide sequence of a nucleic acid molecule of the
invention (for instance, a nucleic acid molecule comprising a nucleotide sequence that
is at least 95% identical to at least a portion of a nucleotide sequence of SEQ ID
NO:X or a nucleotide sequence encoded by a human cDNA clone identified in Table 1
as contained in the deposit with the ATCC Deposit Number shown therein). This
method may be conducted by detecting a nucleotide sequence of an individual cDNA
of the invention or using a panel of nucleotide sequences of the invention. Thus, this
method may comprise a step of detecting nucleic acid molecules comprising a
nucleotide sequence in a panel of at least two nucleotide sequences, where at least one
sequence in the panel is at least 95% identical to at least a portion of a nucleotide
sequence of SEQ ID NO:X or a nucleotide sequence encoded by a human cDNA clone
contained in the ATCC deposit. In this method for identifying the species, tissue or
cell type of a biological sample, the detection of nucleic acid molecules comprising
nucleotide sequences of the invention may be conducted by various techniques known
in the art including, for instance, hybridization of either DNA or RNA probes to
either DNA or RNA molecules obtained from the biological sample, as well as
computational comparisons of nucleotide sequences determined from nucleic acids in a
biological sample with nucleotide sequences of the invention.

Similarly, nucleic acid molecules of the invention may be used in a method for
diagnosing in a subject a pathological condition associated with abnormal structure or
expression of a gene encoding a protein identified in Table 1. This method may
comprise a step of detecting in a biological sample obtained from the subject nucleic
acid molecules comprising a nucleotide sequence that is at least 95% identical to at
least a portion of a nucleotide sequence of SEQ ID NO:X or a nucleotide sequence
encoded by a human cDNA clone identified in Table 1 as contained in the deposit
with the given ATCC Deposit Number. Again, this diagnostic method may involve
analysis of individual nucleotide sequences or panels of several nucleotide sequences,
and the analysis of either DNA or RNA species using either DNA or RNA probes.

For use in identification or diagnostic methods such as those described above,
therefore, the invention also provides a composition of matter comprising isolated
nucleic acid molecules in which the nucleotide sequences of the nucleic acid molecules
comprise a panel of sequences, at least one of which is at least 95% identical to a
sequence, either a nucleotide sequence of SEQ ID NO:X or a nucleotide sequence
encoded by a human cDNA clone contained in the ATCC deposit in Table 1. In this
composition, the nucleic acid molecules may comprise DNA molecules or RNA
molecules or both, as well as polynucleotide equivalents of DNA and RNA which are
not naturally occurring but are known in the art as such.

Another aspect of the invention relates to polypeptides comprising amino acid
sequences encoded by nucleotide sequences of the invention. For identification and
diagnostic purposes, these polypeptides need not include the amino acid sequence of a
complete secreted protein or even of the secreted form of such a protein, since, for
instance, antibodies may bind specifically to a linear epitope of a polypeptide which
comprises as few as 6 to 8 amino acids. Accordingly, the invention also provides an
isolated polypeptide comprising an amino acid sequence at least 90%, preferably
95%, 96%, 97%, 98%, or 99% identical to a sequence of at least about 10, 30 or 100
contiguous amino acids in the amino acid sequence of SEQ ID NO:Y wherein Y is any
integer as defined in Table 1. Preferably, the sequence of contiguous amino acids is
included in the amino acid sequence of SEQ ID NO:Y beginning with the residue at
about the position of the First Amino Acid of the Secreted Portion where one exists or
the first amino acid of the open reading frame if the protein is not indicated as having a
signal peptide and ending with the residue at about the Last Amino Acid of the Open
Reading Frame as set forth for SEQ ID NO:Y in Table 1. A preferred embodiment of
this aspect relates to an isolated polypeptide comprising an amino acid sequence at
least 95% identical to the complete amino acid sequence of SEQ ID NO:Y.

As noted above, however, the determined amino acid sequence of SEQ ID
NO:Y may not include the complete amino acid sequence of the protein encoded by
each cDNA in the ATCC deposit identified in Table 1. Accordingly, the invention
further provides an isolated polypeptide comprising an amino acid sequence at least
90% identical, preferably at least 95%, 96%, 97%, 98% or 99% identical to a
sequence of at least about 10, 30 or 100 contiguous amino acids in the complete
amino acid sequence of a secreted protein encoded by a human cDNA clone identified
by a cDNA Clone Identifier in Table 1 and contained in the deposit with the ATCC
Deposit Number shown for that cDNA clone in Table 1. A particularly preferred
embodiment of this aspect is a polypeptide in which the sequence of contiguous
amino acids is included in the amino acid sequence of a secreted ("mature") portion of
the protein encoded by a human cDNA clone contained in the deposit, particularly a
polypeptide comprising the entire amino acid sequence of the secreted portion of the
secreted protein encoded by a human cDNA clone of the invention.

For purposes such as tissue identification and diagnosis of pathological
conditions, the invention also provides an isolated antibody which binds specifically
to a polypeptide comprising an amino acid sequence of the invention (for instance, a
sequence that is identical to a sequence of at least 6, preferably at least 7, 8, 9 or 10,
contiguous amino acids in an amino acid sequence of SEQ ID NO:Y or in a complete
amino acid sequence of a protein encoded by a human cDNA clone identified by a
cDNA Clone Identifier in Table 1 and contained in the deposit cited therein). Further
in the same vein, the invention provides a method for detecting in a biological sample a
polypeptide comprising an amino acid sequence which is identical to a sequence of at
least 6, preferably at least 7, 8, 9 or 10 contiguous amino acids in a sequence selected
from the group consisting of an amino acid sequence of SEQ ID NO:Y and a complete
amino acid sequence of a protein encoded by a human cDNA clone identified by a
cDNA Clone Identifier in Table 1 and contained in the deposit with the ATCC
Deposit Number shown for that cDNA clone in Table 1. This method comprises a
step of comparing an amino acid sequence of at least one polypeptide molecule in said
sample with a sequence selected from the above group and determining whether the
sequence of that polypeptide molecule in the sample is identical to the selected
sequence of at least 6-10 contiguous amino acids. This step of comparing an amino
acid sequence of at least one polypeptide molecule in the sample with a sequence
selected from the above group may comprise determining the extent of specific binding
of polypeptides in the sample to an antibody which binds specifically to a
polypeptide comprising an amino acid sequence of the invention. Alternatively, this
comparison step may be performed by comparing the amino acid sequence determined
from a polypeptide molecule in the sample with the sequence selected from the above
group, for instance, using computational methods.
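
One simple computational comparison of the kind mentioned above is to look for stretches
of contiguous amino acids shared between a sample sequence and a reference sequence.
The sketch below is illustrative only; the function name, the default window of 6
residues and both peptide sequences are hypothetical.

```python
def shared_peptides(sample, reference, k=6):
    """Return the sorted k-residue stretches that occur in both sequences."""
    reference_kmers = {reference[i:i + k] for i in range(len(reference) - k + 1)}
    sample_kmers = {sample[i:i + k] for i in range(len(sample) - k + 1)}
    return sorted(sample_kmers & reference_kmers)


# Hypothetical sequences in one-letter amino acid code.
reference = "MKTAYIAKQRQISFVK"
sample = "GGAYIAKQRTT"
print(shared_peptides(sample, reference))  # ['AYIAKQ', 'YIAKQR']
```

A non-empty result indicates that the sample contains at least one stretch of k contiguous
amino acids identical to a stretch in the reference; raising k to 7, 8, 9 or 10 applies
the stricter thresholds.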

The invention further provides methods for identifying the species, tissue or 
cell type of a biological sample comprising a step of detecting polypeptide molecules 
in the sample which include an amino acid sequence that is identical to a sequence of at
least 6-10 contiguous amino acids in an amino acid sequence of SEQ ID NO:Y or of a
protein encoded by a cDNA identified in Table 1 and contained in the cited deposit. This
method may
involve analyses of polypeptides for the presence of individual amino acid sequences
of the invention or of panels of such sequences. Similarly provided are methods for
diagnosing in a subject a pathological condition associated with abnormal structure or
expression of a gene encoding a protein identified in Table 1. In preferred
embodiments of these methods of the invention for identification or diagnosis, an 
antibody which binds specifically to a polypeptide comprising an amino acid 
sequence of the invention is used to analyze amino acid sequences of polypeptides in 
a biological sample. 

In yet another aspect, the invention provides recombinant means for making a
polypeptide comprising all or a portion of an amino acid sequence of the invention.
For this purpose, the invention provides an isolated nucleic acid molecule comprising a
nucleotide sequence which is, for instance, at least 95% identical to a nucleotide
sequence encoding a polypeptide which comprises an amino acid sequence of the
invention (for instance, one that is at least 90% identical to SEQ ID NO:Y).

It will be readily appreciated by one of ordinary skill that, due to the
degeneracy of the genetic code, any nucleotide sequence encoding the amino acid
sequence of a given protein needs to share only a low level of identity with the
nucleotide sequence of a human cDNA clone which encodes the identical amino acid
sequence of that protein. It will be further appreciated that the nucleotide sequences of
the deposited cDNAs presumably all comprise codons optimized for expression by
human cells from which the cDNAs originated. Therefore, for improved expression in
recombinant prokaryotic host cells, for instance, it may be desirable to alter the codon
usage in a nucleic acid molecule encoding an amino acid sequence of the invention,
selecting codons in accordance with the redundancy of the genetic code, which provide
optimal codon usage in the selected host. Preferred nucleic acid molecules of this
aspect of the invention are those which encode a polypeptide which comprises a
complete amino acid sequence of SEQ ID NO:Y or a complete amino acid sequence of
a protein encoded by a human cDNA clone identified in Table 1 and contained in the
deposit cited therein.
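
The point about degeneracy can be shown with a small sketch in which every codon is
swapped for a synonymous one. The PREFERRED table below is a hypothetical stand-in for a
host codon preference table, not measured usage data, and the function names and the
nine-nucleotide sequence are likewise invented for illustration.

```python
# Standard genetic code, indexed in the conventional T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Hypothetical host preferences: one chosen codon per amino acid.
PREFERRED = {"L": "CTG", "S": "AGC", "R": "CGT"}


def translate(dna):
    """Translate complete codons of dna in frame 1 (stop codons appear as '*')."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


def recode(dna, preferred):
    """Replace each complete codon with the preferred synonymous codon, if one is listed."""
    codons = (dna[i:i + 3] for i in range(0, len(dna) - 2, 3))
    return "".join(preferred.get(CODON_TABLE[codon], codon) for codon in codons)


original = "TTATCAAGA"  # encodes Leu-Ser-Arg
recoded = recode(original, PREFERRED)
matches = sum(x == y for x, y in zip(original, recoded))
print(recoded, translate(original), translate(recoded), matches)
```

In this toy case the two sequences encode the same three residues while agreeing at only
2 of 9 nucleotide positions.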

Using such nucleic acid molecules encoding polypeptides of the invention, the 
invention further provides recombinant means for making the polypeptides. Thus, 
included is a method of making a recombinant vector comprising inserting an isolated 
nucleic acid molecule of the invention into a vector, as well as a recombinant vector 
produced by this method. Also included is a method of making a recombinant host 
cell comprising introducing a vector of the invention into a host cell, and a recombinant 
host so made. Such cells are useful, for instance, in a method of making an isolated 
polypeptide of the invention which comprises culturing a recombinant host cell under 
conditions such that the polypeptide is expressed and recovering the polypeptide. 

In a preferred embodiment of this method, the recombinant host cell is a
eukaryotic cell and the nucleic acid of the invention
encodes the complete amino acid sequence of a protein encoded by a cDNA identified
in Table 1, so that the polypeptide produced by this method is a secreted ("mature")
portion of a human secreted protein of the invention (i.e., one comprising an amino
acid sequence of SEQ ID NO:Y beginning with the residue at the position identified in
Table 1 as the First AA of Secreted Portion of SEQ ID NO:Y or an amino acid
sequence of a secreted portion of a secreted protein encoded by a human cDNA clone
identified in Table 1 and contained in the deposit with the ATCC Deposit Number
shown in Table 1). The invention further provides an isolated polypeptide which is a
secreted portion of a human secreted protein produced by the above method. Where
the polypeptide shown in Table 1 does not have a leader sequence, one may be
provided by the vector. Such vectors are known in the art and are discussed below.

In yet another aspect, the invention provides a method of treatment of an 
individual in need of an increased level of a secreted protein activity. As described 
herein, diagnostic methods of the invention enable the identification of such 
individuals, that is, individuals with a pathological condition involving a particular 
organ, tissue or cell type, exhibiting lower levels of expression product (e.g., mRNA
or antigen) of a given secreted protein in that organ, tissue or cell type, or those with 
mutant expression products, compared with normal individuals not suffering from the 
pathology. The method of the invention for treatment of an individual with such a 
pathological condition comprises administering to such an individual a pharmaceutical 
composition comprising an amount of an isolated polypeptide of a secreted protein of 
the invention effective to increase the level of activity of that secreted protein in the 
individual. 

Agonists and antagonists of the polypeptides of the invention and methods for 
using these also are provided. 

Brief Description of the Drawings 

Figure 1 shows the nucleotide sequence and deduced amino acid sequence of 
CCV (HEMFI85), SEQ ID NOS:1 and 2, respectively.

Figure 2 shows the nucleotide sequence and deduced amino acid sequence of 
CAT-1 (HTXET53), SEQ ID NOS:3 and 4, respectively. 

Figure 3 shows the nucleotide sequence and deduced amino acid sequence of 
CAT-2 (HT3SG28), SEQ ID NOS:5 and 6, respectively. 

Figure 4 shows the nucleotide sequence and deduced amino acid sequence of 
MIA-2 (HBXAK03), SEQ ID NOS:7 and 8, respectively. 

Figure 5 shows the nucleotide sequence and deduced amino acid sequence of 
MIA-3 (HLFBD44), SEQ ID NOS:9 and 10, respectively. 

Figure 6 shows the nucleotide sequence and deduced amino acid sequence of 
AIF-2 (HEBGM49), SEQ ID NOS:11 and 12, respectively.

Figure 7 shows the nucleotide sequence and deduced amino acid sequence of 
AIF-3 (HNGBH45), SEQ ID NOS:13 and 14, respectively. 

Figure 8 shows the nucleotide sequence and deduced amino acid sequence of 
Annexin (HSAAL25), SEQ ID NOS:15 and 16, respectively.

Figure 9 shows the nucleotide sequence and deduced amino acid sequence of 
ES/130-like I (HUSAX55), SEQ ID NOS:17 and 18, respectively. 



WO 98/31800 PCT/US98/0096O 

Figure 10 shows the nucleotide sequence and deduced amino acid sequence of 
BEF (HSXCK41), SEQ ID NOS:19 and 20, respectively. 

Figure 11 shows the nucleotide sequence and deduced amino acid sequence of
ADF (HFKFY79), SEQ ID NOS:21 and 22, respectively. 

Figure 12 shows the nucleotide sequence and deduced amino acid sequence of
Bel-like (HAICH28), SEQ ID NOS:23 and 24, respectively.

Detailed Description 

Nucleic Acid Molecules 

Nucleotide Sequences and ATCC Deposits of cDNA Clones Encoding
Human Proteins

The present invention provides isolated nucleic acid molecules comprising
polynucleotide sequences which have been identified as sequences encoding human
proteins. The invention further provides a nucleotide sequence determined from an
mRNA molecule encoding each human protein identified in Table 1, which comprises
all or a substantial portion of the complete nucleotide sequence of the mRNA
encoding each protein of the invention and has been assigned a SEQ ID NO = "X" in
the Sequence Listing and Figures hereinbelow.

The term "isolated" means that the material is removed from its original
environment (e.g., the natural environment if it is naturally occurring). For example, a
naturally-occurring nucleic acid molecule or polynucleotide present in a living
organism is not isolated, but the same nucleic acid molecule or polynucleotide,
separated from some or all of the coexisting materials in the natural environment, is
isolated. Such nucleic acid molecule could be part of a vector and/or such
polynucleotide could be part of a composition, and still be isolated in that such vector
or composition is not part of the natural environment of the nucleic acid molecule or
polynucleotide.

By "nucleotide sequence" of a nucleic acid molecule or polynucleotide is
intended, for a DNA molecule or polynucleotide, a sequence of deoxyribonucleotides,
and for an RNA molecule or polynucleotide, the corresponding sequence of
ribonucleotides (A, G, C and U), where each thymidine deoxyribonucleotide (T) in the
specified deoxyribonucleotide sequence is replaced by the ribonucleotide uridine (U).
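
The correspondence between a DNA sequence and its RNA counterpart amounts to a single
substitution, as the following minimal sketch shows. The function name to_rna and the
example sequence are hypothetical, and upper-case input is assumed.

```python
def to_rna(dna):
    """Return the RNA sequence corresponding to dna, with each T replaced by U."""
    return dna.replace("T", "U")


print(to_rna("ATGCT"))  # AUGCU
```
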

Using the information provided herein, such as a nucleotide sequence shown in
the sequence listing, a nucleic acid molecule of the present invention encoding a
polypeptide may be obtained using standard cloning and screening procedures, such as
those for cloning cDNAs using mRNA as starting material. The present invention
provides not only the determined nucleotide sequences of the mRNA encoding each
human secreted protein of the invention, as set forth in SEQ ID NO:X for each
protein, but also a sample of plasmid DNA containing a cDNA of the invention
deposited with the American Type Culture Collection (Rockville, MD), as set forth in
Table 1. These deposits enable recovery of each cDNA clone and recombinant
production of each secreted protein of the invention actually encoded by a cDNA
clone identified in Table 1, as further described hereinbelow.
15 Nucleic acid molecules of the present invention may be in the form of RNA, 

such as mRNA, or in the form of DNA, including, for instance, cDNA and genomic 
DNA obtained by cloning or produced synthetically. The DNA may be 
double-stranded or single-stranded. Single-stranded DNA or RNA may be the coding 
strand, also known as the sense strand, or it may be the non-coding strand, also 
20 referred to as the anti-sense strand. 
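
By way of illustration only, the DNA-to-RNA correspondence (each T replaced by U) and the relationship between the sense and anti-sense strands can be sketched in a few lines of Python. The function names and the example sequence are hypothetical and are not taken from the sequence listing.

```python
# Illustrative sketch; names and the example sequence are assumptions.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def dna_to_rna(dna: str) -> str:
    """Return the RNA sequence corresponding to a DNA sequence (T -> U)."""
    return dna.upper().replace("T", "U")

def antisense_strand(dna: str) -> str:
    """Return the anti-sense (reverse-complement) strand, written 5' to 3'."""
    return dna.upper().translate(COMPLEMENT)[::-1]

sense = "ATGGCCTTAA"               # hypothetical sense-strand fragment
print(dna_to_rna(sense))           # AUGGCCUUAA
print(antisense_strand(sense))     # TTAAGGCCAT
```

Applying `antisense_strand` twice returns the original sense strand, which is a convenient sanity check.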

In addition to nucleic acid molecules comprising a determined nucleotide
sequence in SEQ ID NO:X or the nucleotide sequence of a deposited human cDNA
clone, isolated nucleic acid molecules of the invention include DNA molecules which
comprise a sequence substantially different from those described above but which, due
to the degeneracy of the genetic code, still encode the proteins shown in the sequence
listing or those encoded by the clones contained in the deposited plasmids. Of course,
the genetic code and species-specific codon preferences are well known in the art.
Thus, it would be routine for one skilled in the art to generate the degenerate variants
described above, for instance, to optimize codon expression for a particular host (e.g.,
change codons in the human mRNA to those preferred by a bacterial host such as E.
coli). Preferably, this nucleic acid molecule will encode a secreted portion (mature
polypeptide) encoded by the deposited cDNA.
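
A minimal sketch of generating a degenerate variant is given below, assuming the standard genetic code. The preference table is a small hypothetical stand-in for a host-specific codon usage table, and the input sequence is invented for illustration. The sketch recodes each codon to a preferred synonymous codon and confirms that the encoded amino acid sequence is unchanged.

```python
from itertools import product

# Standard genetic code, bases ordered T, C, A, G ('*' marks a stop codon).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def translate(dna: str) -> str:
    """Translate a DNA coding sequence, ignoring any trailing partial codon."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))

def recode(dna: str, preferred: dict) -> str:
    """Swap each codon for the preferred synonymous codon, if one is listed."""
    out = []
    usable = len(dna) - len(dna) % 3
    for i in range(0, usable, 3):
        codon = dna[i:i + 3]
        aa = CODON_TABLE[codon]
        new = preferred.get(aa, codon)
        if CODON_TABLE[new] != aa:
            raise ValueError(f"{new} does not encode {aa}")
        out.append(new)
    return "".join(out)

# Hypothetical preferences for illustration; a real host table would be larger.
HYPOTHETICAL_PREFERENCES = {"L": "CTG", "R": "CGT"}

original = "ATGCTTAGATAA"                       # invented example: M L R stop
variant = recode(original, HYPOTHETICAL_PREFERENCES)
print(variant)                                  # ATGCTGCGTTAA
print(translate(original) == translate(variant))  # True
```

The variant differs from the original at the nucleotide level yet encodes the same polypeptide, which is the property the degenerate variants described above rely on.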

The invention further provides a nucleic acid molecule having a sequence 
complementary to one of the above sequences. Such isolated molecules, particularly 
DNA molecules, are useful as probes for gene mapping, by in situ hybridization with
chromosomes, and for detecting expression of the corresponding gene(s) in human 
tissue, for instance, by Northern blot analysis. 

The present invention is further directed to nucleic acid molecules encoding 
portions of the nucleotide sequences described herein as well as to fragments of the 
isolated nucleic acid molecules described herein. By a "fragment" of an isolated 
nucleic acid molecule having the nucleotide sequence of the deposited cDNA or the 
nucleotide sequence shown in the sequence listing is intended fragments at least about 
15 nt, and more preferably at least about 20 nt, still more preferably at least about 30
nt, and even more preferably, at least about 40 nt in length which are useful as 
diagnostic probes and primers as discussed herein. Of course, larger fragments 50-500 
nt in length are also useful according to the present invention as are fragments 
corresponding to most, if not all, of the nucleotide sequence of the deposited cDNA or 
as shown in the sequence listing. By a fragment "at least 20 nt in length," for example, 
is intended fragments which include 20 or more contiguous bases from the nucleotide 
sequence of the deposited cDNA or the determined nucleotide sequence shown in 
SEQ ID NO:X. Preferred nucleic acid fragments of the present invention include 
nucleic acid molecules encoding epitope-bearing portions of the polypeptides of the 
present invention, as described further below. 
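
The "at least 20 nt in length" criterion can be expressed as a simple test for a shared run of contiguous bases. The following is only a sketch: the function name is an assumption and the reference sequence is invented rather than drawn from the sequence listing.

```python
def contains_fragment(candidate: str, reference: str, min_len: int = 20) -> bool:
    """True if candidate includes min_len or more contiguous bases of reference."""
    candidate, reference = candidate.upper(), reference.upper()
    return any(
        candidate[i:i + min_len] in reference
        for i in range(len(candidate) - min_len + 1)
    )

# Invented reference sequence used purely for illustration.
reference = "ATGGCTAGCTAGGATCCGATCGATTACGCGTATAGCTAGCTAACG"
long_piece = reference[5:30]    # 25 contiguous bases of the reference
short_piece = reference[5:20]   # only 15 contiguous bases

print(contains_fragment(long_piece, reference))    # True
print(contains_fragment(short_piece, reference))   # False
```

A candidate shorter than `min_len` can never satisfy the test, because the range of window start positions is then empty.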

In another aspect, the invention provides an isolated nucleic acid molecule 
comprising a polynucleotide which hybridizes under stringent hybridization 
conditions to a portion of a nucleic acid molecule of the invention described above, for 
instance, a cDNA contained in the plasmid sample deposited with the ATCC. By
"stringent hybridization conditions" is intended overnight incubation at 42°C in a
solution comprising: 50% formamide, 5x SSC (150 mM NaCl, 15 mM trisodium
citrate), 50 mM sodium phosphate (pH 7.6), 5x Denhardt's solution, 10% dextran
sulfate, and 20 µg/ml denatured, sheared salmon sperm DNA, followed by washing
the filters in 0.1x SSC at about 65°C.

By a polynucleotide which hybridizes to a "portion" of a polynucleotide is
intended a polynucleotide (either DNA or RNA) hybridizing to at least about 15
nucleotides (nt), and more preferably at least about 20 nt, still more preferably at least
about 30 nt, and even more preferably about 30-70 (e.g., 50) nt of the reference
polynucleotide. These are useful as diagnostic probes and primers as discussed above
and in more detail below. For certain applications, such as the FISH technique for
gene mapping on chromosomes, probes of 500 nucleotides up to 2000 nucleotides
may be preferred.

By a portion of a polynucleotide of "at least 20 nt in length," for example, is
intended 20 or more contiguous nucleotides from the nucleotide sequence of the
reference polynucleotide (e.g., the deposited cDNA or the nucleotide sequence as
shown in SEQ ID NO:X). Of course, a polynucleotide which hybridizes only to a
poly A sequence (such as any 3' terminal poly (A) tract of a cDNA shown in the
sequence listing), or to a complementary stretch of T (or U) residues, would not be
included in a polynucleotide of the invention used to hybridize to a portion of a
nucleic acid of the invention, since such a polynucleotide would hybridize to any
nucleic acid molecule containing a poly (A) stretch or the complement thereof (e.g.,
practically any double-stranded cDNA clone).
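
As a side calculation on the buffers named in the stringent hybridization conditions above, salt concentrations at a given SSC strength can be derived by simple scaling. This sketch assumes the conventional laboratory definition of 1x SSC as 150 mM NaCl and 15 mM trisodium citrate; on that assumption, the figures given in parentheses in the text would correspond to 1x rather than 5x strength. The function name is hypothetical.

```python
# Assumption: conventional 1x SSC = 150 mM NaCl, 15 mM trisodium citrate.
SSC_1X_MM = {"NaCl": 150.0, "trisodium citrate": 15.0}

def ssc_concentrations(fold: float) -> dict:
    """Return component concentrations (mM) for SSC at the given fold strength."""
    return {name: round(mm * fold, 3) for name, mm in SSC_1X_MM.items()}

print(ssc_concentrations(5))     # hybridization buffer strength
print(ssc_concentrations(0.1))   # wash buffer strength
```

The lower ionic strength of the 0.1x wash, together with the higher wash temperature, is what makes the wash step the more stringent of the two.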

Also encoded by nucleic acids of the invention are the amino acid sequences of
the invention together with additional, non-coding sequences, including for example,
but not limited to introns and non-coding 5' and 3' sequences, such as the transcribed,
non-translated sequences that play a role in transcription, mRNA processing,
including splicing and polyadenylation signals, for example, ribosome binding and
stability of mRNA; and additional coding sequence which codes for additional amino
acids, such as those which provide additional functionalities.

Thus, the sequence encoding the polypeptide may be fused to a marker
sequence, such as a sequence encoding a peptide which facilitates purification of the
fused polypeptide. In certain preferred embodiments of this aspect of the invention,
the marker amino acid sequence is a hexa-histidine peptide, such as the tag provided in
a pQE vector (QIAGEN, Inc., 9259 Eton Avenue, Chatsworth, CA, 91311), among
others, many of which are commercially available. As described in Gentz et al., Proc.
Natl. Acad. Sci. USA 86:821-824 (1989), for instance, hexa-histidine provides for
convenient purification of the fusion protein. The "HA" tag is another peptide useful
for purification which corresponds to an epitope derived from the influenza
hemagglutinin protein, which has been described by Wilson et al., Cell 37:767 (1984).
As discussed below, other such fusion proteins include those fused to Fc at the N- or
C-terminus.

Sequences Encoding Signal Peptide and Secreted Portions 

According to the signal hypothesis, proteins secreted by eukaryotic cells have
a signal peptide (or secretory leader sequence) which is cleaved from the complete
polypeptide to produce a secreted portion or "mature" form of the protein. Methods
for predicting whether a protein has a signal peptide (or "secretory leader") as well as
the cleavage point for that leader sequence are well known in the art. See, for instance,
von Heijne, supra. The determined amino acid sequences of several proteins of the
invention, obtained by translation of the determined nucleotide sequences identified
in Table 1, have been found to comprise an amino acid sequence of a secretory signal
peptide. The sequence and cleavage site of that signal peptide are described in Table 1
and in the Examples and the signal sequence is underlined in the Figures, to the extent
that these have been determined for each secreted protein of the invention.

More particularly, the present invention provides nucleic acid molecules
encoding a secreted portion (mature form) of each secreted protein identified in Table
1. Most mammalian cells and even insect cells cleave signal peptides from secreted
proteins with approximately the same specificity. However, in some cases, cleavage
of the signal peptide (also referred to herein as a "leader sequence" or "leader") from a
secreted protein is not entirely uniform, which results in more than one secreted (also
herein "mature") form or species of the protein. Further, it has long been known that the
cleavage specificity of a secreted protein is ultimately determined by the primary
structure of the complete protein, that is, it is inherent in the amino acid sequence of
the initial polypeptide translated from its mRNA. Therefore, the present invention
provides not only a determined nucleotide sequence and translated amino acid
sequence identifying a signal peptide and secreted portion of each secreted protein of
the invention, but also a deposited sample of a cDNA clone encoding a secreted
(mature) form of each secreted protein of the invention.
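
To make the relationship between the complete polypeptide, the signal peptide and the mature form concrete, the following sketch splits a precursor at one or more cleavage positions. The precursor sequence and the cleavage positions are invented for illustration; they are not taken from Table 1, and predicting real cleavage sites requires the dedicated methods referred to above.

```python
def split_precursor(precursor: str, cleavage_after: int) -> tuple:
    """Split a precursor into (signal peptide, mature form).

    cleavage_after is the number of residues in the signal peptide.
    """
    if not 0 < cleavage_after < len(precursor):
        raise ValueError("cleavage position must fall inside the precursor")
    return precursor[:cleavage_after], precursor[cleavage_after:]

def mature_forms(precursor: str, cleavage_sites: list) -> list:
    """Return one mature species per cleavage site (non-uniform cleavage)."""
    return [split_precursor(precursor, site)[1] for site in cleavage_sites]

# Hypothetical precursor: a 14-residue leader followed by a short mature region.
precursor = "MKLLVLLSLALASA" + "QPVEDK"

signal, mature = split_precursor(precursor, 14)
print(signal)                               # MKLLVLLSLALASA
print(mature)                               # QPVEDK
print(mature_forms(precursor, [14, 16]))    # ['QPVEDK', 'VEDK']
```

The second call models the case where cleavage is not entirely uniform, so that more than one mature species arises from the same complete open reading frame.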

More particularly, the invention further provides an isolated polypeptide
comprising an amino acid sequence at least 90% identical, preferably 95%, 96%,
97%, 98% or 99% identical, to a sequence of at least about 25, 50 or 100 contiguous
amino acids in the complete amino acid sequence of a protein encoded by a human
cDNA clone identified by a cDNA Clone Identifier in Table 1 and contained in the
deposit with the ATCC Deposit Number shown for that cDNA clone in Table 1. A
particularly preferred embodiment of this aspect of the invention is a polypeptide in
which the sequence of contiguous amino acids is included in the amino acid sequence
of a secreted portion of a secreted protein encoded by a human cDNA clone identified
by a cDNA Clone Identifier in Table 1 and contained in the deposit with the ATCC
Deposit Number shown for said cDNA clone in Table 1. By the "secreted portion [or
mature form] of a secreted protein encoded by a human cDNA clone identified by a
cDNA Clone Identifier in Table 1 and contained in the deposit with the ATCC
Deposit Number shown for said cDNA clone in Table 1" is meant the secreted
portion(s) or mature form(s) of the protein produced by expression in any eukaryotic
cell (for instance, cells of an established insect or mammalian cell line), preferably a
human cell (for instance, cells of the well known HeLa cell line), of the complete open
reading frame encoded by the human cDNA clone identified in Table 1 and contained
in the deposit cited in Table 1.

Variant and Mutant Polynucleotides

The present invention further relates to variants of the nucleic acid molecules
of the present invention, which encode portions, analogs or derivatives of the secreted
proteins. Variants may occur naturally, such as a natural allelic variant. By an "allelic
variant" is intended one of several alternate forms of a gene occupying a given locus on
a chromosome of an organism. Genes II, Lewin, B., ed., John Wiley & Sons, New
York (1985). Non-naturally occurring variants may be produced using art-known
mutagenesis techniques.

Such variants include those produced by nucleotide substitutions, deletions or
additions. The substitutions, deletions or additions may involve one or more
nucleotides. The variants may be altered in coding regions, non-coding regions, or
both. Alterations in the coding regions may produce conservative or non-conservative
amino acid substitutions, deletions or additions. Especially preferred among these are
silent substitutions, additions and deletions, which do not alter the properties and
activities of the secreted protein or portions thereof. Also especially preferred in this
regard are conservative substitutions.

Most highly preferred are nucleic acid molecules encoding a secreted portion
(mature form) of a protein described in Table 1 and having the amino acid sequence
shown in the sequence listing as SEQ ID NO:X, or the amino acid sequence of the
secreted portion (mature form) of the protein encoded by a deposited cDNA clone.
Further embodiments include an isolated nucleic acid molecule comprising a
polynucleotide having a nucleotide sequence at least 85% identical, more preferably at
least 90% identical, and most preferably at least 95%, 96%, 97%, 98% or 99%
identical to a polynucleotide of the invention described in Table 1, or a polynucleotide
which hybridizes under stringent hybridization conditions to such a polynucleotide.
This polynucleotide which hybridizes does not hybridize under stringent
hybridization conditions to a polynucleotide having a nucleotide sequence consisting
of only A residues or of only T residues. An additional nucleic acid embodiment of
the invention relates to an isolated nucleic acid molecule comprising a polynucleotide
which encodes the amino acid sequence of an epitope-bearing portion of a secreted
polypeptide having an amino acid sequence of SEQ ID NO:Y or an amino acid
sequence of a secreted protein encoded by a cDNA clone in the deposit identified in
Table 1.

By a polynucleotide having a nucleotide sequence at least, for example, 95% 
"identical" to a reference nucleotide sequence encoding a secreted polypeptide is 
intended that the nucleotide sequence of the polynucleotide is identical to the 
reference sequence except that the polynucleotide sequence may include up to five 
point mutations per each 100 nucleotides of the reference nucleotide sequence 
encoding the secreted polypeptide. In other words, to obtain a polynucleotide having 
a nucleotide sequence at least 95% identical to a reference nucleotide sequence, up to 
5% of the nucleotides in the reference sequence may be deleted or substituted with 
another nucleotide, or a number of nucleotides up to 5% of the total nucleotides in the 
reference sequence may be inserted into the reference sequence. These mutations of 
the reference sequence may occur at the 5' or 3' terminal positions of the reference 
nucleotide sequence or anywhere between those terminal positions, interspersed either 
individually among nucleotides in the reference sequence or in one or more contiguous 
groups within the reference sequence. 
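
The percent identity definition above can be sketched as a count of differing alignment columns relative to the reference length. The sketch assumes the two sequences have already been aligned to equal length with "-" marking gaps; the function name and the example sequences are hypothetical. Substitutions, deletions (gap in the query) and insertions (gap in the reference) each count as one difference.

```python
def percent_identity(reference: str, query: str) -> float:
    """Percent identity of a pre-aligned query, measured against the reference.

    Both strings must be rows of one pairwise alignment ('-' marks a gap).
    """
    if len(reference) != len(query):
        raise ValueError("aligned sequences must have equal length")
    ref_len = sum(1 for r in reference if r != "-")
    differences = sum(1 for r, q in zip(reference, query) if r != q)
    return max(0.0, 100.0 * (ref_len - differences) / ref_len)

# Invented 100-nt reference and a query carrying five substitutions.
reference = "ACGT" * 25
query = list(reference)
for pos in (0, 20, 40, 60, 80):
    query[pos] = "C"            # each of these positions holds 'A' in the reference
query = "".join(query)

print(percent_identity(reference, query))          # 95.0
print(percent_identity(reference, query) >= 95)    # True
```

Five point mutations in a 100-nucleotide reference give exactly 95% identity, matching the "up to five point mutations per each 100 nucleotides" reading in the text.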

As a practical matter, whether any particular nucleic acid molecule is at least 
85%, 90%, 95%, 96%, 97%, 98% or 99% identical to, for instance, the nucleotide 
sequence shown in SEQ ID NO:1, or to the nucleotide sequence of a deposited cDNA
can be determined conventionally using known computer programs such as the Bestfit 
program (Wisconsin Sequence Analysis Package, Version 8 for Unix, Genetics 
Computer Group, University Research Park, 575 Science Drive, Madison, WI 53711). 
Bestfit uses the local homology algorithm of Smith and Waterman, Advances in 
Applied Mathematics 2:482-489 (1981), to find the best segment of homology between 
two sequences. When using Bestfit or any other sequence alignment program to 
determine whether a particular sequence is, for instance, 95% identical to a reference 
sequence according to the present invention, the parameters are set, of course, such
that the percentage of identity is calculated over the full length of the reference
nucleotide sequence and that gaps in homology of up to 5% of the total number of 
nucleotides in the reference sequence are allowed. 
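
For readers unfamiliar with the local homology algorithm of Smith and Waterman, a compact scoring sketch follows. It is not the Bestfit program: the match, mismatch and linear gap scores are arbitrary illustrative values, and the function returns only the best local alignment score rather than the alignment itself.

```python
def smith_waterman_score(a: str, b: str,
                         match: int = 2, mismatch: int = -1, gap: int = -2) -> int:
    """Best local alignment score between a and b (linear gap penalty)."""
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        cur = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A cell never drops below zero, which is what makes the alignment local.
            cur[j] = max(0, diag, prev[j] + gap, cur[j - 1] + gap)
            best = max(best, cur[j])
        prev = cur
    return best

print(smith_waterman_score("ACGT", "ACGT"))            # 8
print(smith_waterman_score("TTTACGTTTT", "GGACGTGG"))  # 8
print(smith_waterman_score("AAAA", "CCCC"))            # 0
```

The second call shows the local character of the method: the shared ACGT segment is found and scored even though the flanking bases do not match.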

Uses for Nucleic Acid Molecules of the Invention 
Each of the nucleic acid molecules identified herein can be used in numerous
ways as polynucleotide reagents. The polynucleotides can be used as diagnostic
probes for the presence of a specific mRNA in a particular cell type. In addition,
these polynucleotides can be used as diagnostic probes suitable for use in genetic
linkage analysis (polymorphisms). Further, the polynucleotides can be used as
probes for locating gene regions associated with genetic disease, as explained in more
detail below.

The polynucleotides of the present invention are also valuable for chromosome
identification. Each polynucleotide is specifically targeted to and can hybridize with a
particular location on an individual human chromosome. Moreover, there is a current
need for identifying particular sites on the chromosome. Few chromosome marking
reagents based on actual sequence data (repeat polymorphisms) are presently available
for marking chromosomal location. The mapping of cDNAs to chromosomes
according to the present invention is an important first step in correlating those
sequences with genes associated with disease.

Briefly, sequences can be mapped to chromosomes by preparing PCR primers
(preferably 15-25 bp) from the sequences shown in the sequence listing. Computer
analysis of the sequences is used to rapidly select primers that do not span more than
one exon in the genomic DNA, since primers spanning more than one exon would
complicate the amplification process. These
primers are then used for PCR screening of somatic cell hybrids containing individual
human chromosomes. Only those hybrids containing the human gene corresponding
to the secreted protein will yield an amplified fragment.
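
The primer selection step can be sketched as a window scan that keeps only candidate primers lying wholly within a single exon. This is a simplified illustration: the exon coordinates are hypothetical inputs (in practice they must come from comparison with genomic sequence), the cDNA is invented, and real primer design would also weigh melting temperature and specificity.

```python
def primers_within_exons(cdna: str, exons: list, length: int = 20) -> list:
    """List (start, primer) pairs of the given length lying inside one exon.

    exons holds half-open (start, end) intervals in cDNA coordinates.
    """
    if not 15 <= length <= 25:
        raise ValueError("primer length should be 15-25 bp")
    primers = []
    for start, end in exons:
        for i in range(start, end - length + 1):
            primers.append((i, cdna[i:i + length]))
    return primers

# Invented 60-nt cDNA with a hypothetical exon junction at position 30.
cdna = "ATGGCTAGCTAGGATCCGATCGATTACGCG" + "TATAGCTAGCTAACGGATTCAGGCTTAACG"
exons = [(0, 30), (30, 60)]

candidates = primers_within_exons(cdna, exons, length=20)
print(len(candidates))                                     # 22
print(any(s < 30 < s + 20 for s, _ in candidates))         # False
```

Each 30-nt exon yields 11 windows of 20 nt, and none of the 22 candidates crosses the junction at position 30.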

PCR mapping of somatic cell hybrids is a rapid procedure for assigning a
particular nucleic acid sequence to a particular chromosome. Three or more clones can
be assigned per day using a single thermal cycler. Using the present invention with
the same oligonucleotide primers, sublocalization can be achieved with panels of
fragments from specific chromosomes or pools of large genomic clones in an analogous
manner. Other mapping strategies that can similarly be used to map a gene to its
chromosome include in situ hybridization, prescreening with labeled flow-sorted
chromosomes and preselection by hybridization to construct chromosome-specific
cDNA libraries.

Fluorescence in situ hybridization (FISH) of a cDNA clone to a metaphase
chromosomal spread can be used to provide a precise chromosomal location in one
step. This technique can be used with cDNA as short as 500 or 600 bases; however,
clones larger than 2,000 bp have a higher likelihood of binding to a unique
chromosomal location with sufficient signal intensity for simple detection. For
example, 2,000 bp is good, 4,000 is better, and more than 4,000 is probably not
necessary to get good results a reasonable percentage of the time. For a review of this
technique, see Verma et al., Human Chromosomes: a Manual of Basic Techniques,
Pergamon Press, New York (1988).

Reagents for chromosome mapping can be used individually (to mark a single
chromosome or a single site on that chromosome) or as panels of reagents (for marking
multiple sites and/or multiple chromosomes). Reagents corresponding to noncoding
regions of the genes actually are preferred for mapping purposes. Coding sequences
are more likely to be conserved within gene families, thus increasing the chance of
cross hybridizations during chromosomal mapping.

Once a polynucleotide sequence has been mapped to a precise chromosomal
location, the physical position of the sequence on the chromosome can be correlated
with genetic map data. (Such data are found, for example, in V. McKusick, Mendelian
Inheritance in Man (available on line through Johns Hopkins University Welch
Medical Library).) The relationship between genes and diseases that have been
mapped to the same chromosomal region is then identified through linkage analysis
(coinheritance of physically adjacent genes).

Next, it is necessary to determine the differences in the cDNA or genomic
sequence between affected and unaffected individuals. If a mutation is observed in
some or all of the affected individuals but not in any normal individuals, then the
mutation is likely to be the causative agent of the disease.

With current resolution of physical mapping and genetic mapping techniques,
a cDNA precisely localized to a chromosomal region associated with the disease could
be one of between 50 and 500 potential causative genes. (This assumes 1 megabase
mapping resolution and one gene per 20 kb.)
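
The arithmetic behind this estimate is straightforward and can be checked as follows. The function name is hypothetical. The stated assumptions (1 megabase resolution, one gene per 20 kb) give the lower figure of 50; the 10 megabase interval shown for the upper figure of 500 is an inference for illustration and is not stated in the text.

```python
def candidate_genes(resolution_bp: int, bp_per_gene: int = 20_000) -> int:
    """Number of genes expected within a mapped interval of the given size."""
    return resolution_bp // bp_per_gene

print(candidate_genes(1_000_000))    # 50  (1 Mb resolution, as stated)
print(candidate_genes(10_000_000))   # 500 (10 Mb interval, inferred)
```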

Comparison of affected and unaffected individuals generally involves first
looking for structural alterations in the chromosomes, such as deletions or
translocations that are visible from chromosome spreads or detectable using PCR
based on that cDNA sequence. Ultimately, complete sequencing of genes from several
individuals is required to confirm the presence of a mutation and to distinguish
mutations from polymorphisms.

In addition to the foregoing, the polynucleotides of the invention, as broadly
described, can be used to control gene expression through triple helix formation or
antisense DNA or RNA, both of which methods are based on binding of a
polynucleotide sequence to DNA or RNA. Polynucleotides suitable for use in these
methods are usually 20 to 40 bases in length and are designed to be complementary to
a region of the gene involved in transcription (triple helix - see Lee et al., Nucl. Acids
Res., 6:3073 (1979); Cooney et al., Science, 241:456 (1988); and Dervan et al., Science,
251:1360 (1991)) or to the mRNA itself (antisense - Okano, J. Neurochem., 56:560
(1991); Oligodeoxynucleotides as Antisense Inhibitors of Gene Expression, CRC
Press, Boca Raton, FL (1988)). Triple helix formation optimally results in a shut-off
of RNA transcription from DNA, while antisense RNA hybridization blocks
translation of an mRNA molecule into polypeptide. Both techniques have been
demonstrated to be effective in model systems. Information contained in the
sequences of the present invention is necessary for the design of an antisense or triple
helix oligonucleotide.

Nucleic acid molecules of the present invention are also useful in gene
therapy, which requires isolation of the disease-associated gene in question as a
prerequisite to the insertion of a normal gene into an organism to correct a genetic
defect. The high specificity of the cDNA probes according to this invention offers a
means of targeting such gene locations in a highly accurate manner.

The sequences of the present invention, as broadly defined, are also useful for 
identification of individuals from minute biological samples. The United States 
military, for example, is considering the use of restriction fragment length 
polymorphism (RFLP) for identification of its personnel. In this technique, an 
individual's genomic DNA is digested with one or more restriction enzymes, and 
probed on a Southern blot to yield unique bands for identifying personnel. This 
method does not suffer from the current limitations of "Dog Tags" which can be lost,
switched, or stolen, making positive identification difficult. The sequences of the 
present invention are useful as additional DNA markers for RFLP. 

However, RFLP is a pattern based technique, which does not require the DNA 
sequence of the individual to be sequenced. The polynucleotides and sequences of the 
present invention can be used to provide an alternative technique that determines the 
actual base-by-base DNA sequence of selected portions of an individual's genome. 
These sequences can be used to prepare PCR primers for amplifying and isolating 
such selected DNA. One can, for example, take a sequence of the invention and 
prepare two PCR primers. These are used to amplify an individual's DNA, 
corresponding to the gene or gene fragment. The amplified DNA is sequenced.

Panels of corresponding DNA sequences from individuals, made this way, can 
provide unique individual identifications, as each individual will have a unique set of 
such DNA sequences, due to allelic differences. The sequences of the present 
invention can be used to particular advantage to obtain such identification sequences 
from individuals and from tissue, as further described in the Examples. The 
polynucleotide sequences shown in the sequence listing and the inserts contained in 
the deposited cDNAs uniquely represent portions of the human genome. Allelic
variation occurs to some degree in the coding regions of these sequences, and to a
greater degree in the noncoding regions. It is estimated that allelic variation between 
individual humans occurs with a frequency of about once per each 500 bases. Each of 
the sequences comprising a part of the present invention can, to some degree, be used 
as a standard against which DNA from an individual can be compared for 
identification purposes. Because greater numbers of polymorphisms occur in the 
noncoding regions, fewer sequences are necessary to differentiate individuals. 

If a panel of reagents from sequences of this invention is used to generate a 
unique ID database for an individual, those same reagents can later be used to identify 
tissue from that individual. Positive identification of that individual, living or dead, can
be made from extremely small tissue samples.

Another use for DNA-based identification techniques is in forensic biology. 
PCR technology can be used to amplify DNA sequences taken from very small 
biological samples such as tissues, e.g., hair or skin, or body fluids, e.g., blood, saliva, 
semen, etc. In one prior art technique, gene sequences are amplified at specific loci 
known to contain a large number of allelic variations, for example the DQa class II 
HLA gene (Erlich, H., PCR Technology, Freeman and Co. (1992)). Once this specific 
area of the genome is amplified, it is digested with one or more restriction enzymes to 
yield an identifying set of bands on a Southern blot probed with DNA corresponding 
to the DQa class II HLA gene. 

The sequences of the present invention can be used to provide polynucleotide 
reagents specifically targeted to additional loci in the human genome, and can enhance 
the reliability of DNA-based forensic identifications. Those sequences targeted to 
noncoding regions are particularly appropriate. As mentioned above, actual base 
sequence information can be used for identification as an accurate alternative to 
patterns formed by restriction enzyme generated fragments. Reagents for obtaining 
such sequence information are within the scope of the present invention. Such 
reagents can comprise complete genes, ESTs or corresponding coding regions, or
fragments of either of at least 20 bp, preferably at least 50 bp, most preferably at least
500 to 1,000 bp.

There is also a need for reagents capable of identifying the source of a
particular tissue. Such need arises, for example, in forensics when presented with
tissue of unknown origin. Appropriate reagents can comprise, for example, DNA
probes or primers specific to particular tissue prepared from the sequences of the
present invention. Panels of such reagents can identify tissue by species and/or by
organ type. In a similar fashion, these reagents can be used to screen tissue cultures
for contamination.

The present application is directed to nucleic acid molecules at least 85%,
90%, 95%, 96%, 97%, 98% or 99% identical to a nucleic acid sequence referenced in
Table 1 and shown in the sequence listing or to the nucleic acid sequence of a
deposited cDNA, irrespective of whether they encode a polypeptide having biological
activity. This is because even where a particular nucleic acid molecule does not
encode a polypeptide having biological activity, one of skill in the art would still know
how to use the nucleic acid molecule, for instance, for one of the uses above.

Preferred, however, are nucleic acid molecules having sequences at least 85%,
90%, 95%, 96%, 97%, 98% or 99% identical to the nucleic acid sequence shown in
Figure 1 (SEQ ID NO:1) or to the nucleic acid sequence of the deposited cDNA which
do, in fact, encode a secreted polypeptide having biological activity. By "a
polypeptide having biological activity" is intended polypeptides exhibiting activity
similar, but not necessarily identical, to an activity of the mature protein of the
invention, as measured in a particular biological assay. "A polypeptide having
biological activity" includes polypeptides that also exhibit any of the same activities
as a protein of the invention in an assay in a dose-dependent manner. Although the
degree of dose-dependent activity need not be identical to that of the protein,
preferably, "a polypeptide having biological activity" will exhibit substantially similar
dose-dependence in a given activity as compared to the protein (i.e., the candidate
polypeptide will exhibit greater activity or not more than about 25-fold less and,
preferably, not more than about tenfold less activity relative to the reference protein).
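
The activity thresholds in the preceding definition can be restated as a simple comparison of assay readouts. This is a sketch under stated assumptions: the function name, the category labels and the example values are hypothetical, and both readouts are assumed to be positive numbers measured in the same assay at the same dose.

```python
def activity_class(candidate: float, reference: float) -> str:
    """Classify a candidate's activity relative to the reference protein."""
    if candidate <= 0 or reference <= 0:
        raise ValueError("activities must be positive")
    fold_less = reference / candidate   # values below 1 mean greater activity
    if fold_less <= 10:
        return "preferred"              # greater, or not more than tenfold less
    if fold_less <= 25:
        return "acceptable"             # not more than about 25-fold less
    return "outside"

print(activity_class(150.0, 100.0))   # preferred (greater activity)
print(activity_class(20.0, 100.0))    # preferred (5-fold less)
print(activity_class(5.0, 100.0))     # acceptable (20-fold less)
print(activity_class(2.0, 100.0))     # outside (50-fold less)
```

Because the text qualifies both limits with "about", the hard cut-offs used here are a simplification.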

Of course, due to the degeneracy of the genetic code, one of ordinary skill in
the art will immediately recognize that a large number of the nucleic acid molecules
having a sequence at least 85%, 90%, 95%, 96%, 97%, 98%, or 99% identical to the
nucleic acid sequence of the deposited cDNA or the nucleic acid sequence shown in
the sequence listing will encode a polypeptide "having biological activity." In fact,
since degenerate variants of these nucleotide sequences all encode the same
polypeptide, this will be clear to the skilled artisan even without performing the
comparison assay. It will be further recognized in the art that, for such nucleic acid
molecules that are not degenerate variants, a reasonable number will also encode a
polypeptide having biological activity. This is because the skilled artisan is fully
aware of amino acid substitutions that are either less likely or not likely to
significantly affect protein function (e.g., replacing one aliphatic amino acid with a
second aliphatic amino acid), as further described below.

Vectors, Host Cells and Protein Production 

The present invention also relates to vectors which include the isolated DNA
molecules of the present invention, host cells which are genetically engineered with the
recombinant vectors, and the production of polypeptides or fragments thereof by
recombinant techniques. The vector may be, for example, a phage, plasmid, viral or
retroviral vector. Retroviral vectors may be replication competent or replication
defective. In the latter case, viral propagation generally will occur only in
complementing host cells.

The polynucleotides may be joined to a vector containing a selectable marker 

25 for propagation in a host. Generally, a plasmid vector is introduced in a precipitate, 
such as a calcium phosphate precipitate, or in a complex with a charged lipid. If the 
vector is a virus, it may be packaged in vitro using an appropriate packaging cell line 
and then transduced into host cells.

The DNA insert should be operatively linked to an appropriate promoter,
such as the phage lambda PL promoter, the E. coli lac, trp, phoA and tac promoters,
the SV40 early and late promoters and promoters of retroviral LTRs, to name a few.
Other suitable promoters will be known to the skilled artisan. The expression
constructs will further contain sites for transcription initiation, termination and, in the
transcribed region, a ribosome binding site for translation. The coding portion of the
transcripts expressed by the constructs will preferably include a translation initiating
codon at the beginning and a termination codon (UAA, UGA or UAG) appropriately
positioned at the end of the polypeptide to be translated.
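
The framing of a coding portion by an initiating codon and one of the three termination codons can be checked mechanically. The following is a minimal sketch only, not part of the disclosed methods: the function name, the use of AUG as the initiating codon, and the example sequence are assumptions made for illustration.

```python
# Termination codons are those named in the text above.
# Treating AUG as the initiating codon is an assumption of this sketch.
STOP_CODONS = {"UAA", "UGA", "UAG"}
START_CODON = "AUG"


def check_coding_portion(mrna: str) -> dict:
    """Report whether a coding portion is framed by start and stop codons."""
    seq = mrna.upper().replace("T", "U")  # accept DNA-style input as well
    usable = len(seq) - len(seq) % 3      # ignore a trailing partial codon
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    # A stop codon anywhere before the last codon would truncate translation.
    premature = [i for i, codon in enumerate(codons[:-1]) if codon in STOP_CODONS]
    return {
        "length_multiple_of_3": len(seq) % 3 == 0,
        "starts_with_initiator": bool(codons) and codons[0] == START_CODON,
        "ends_with_terminator": bool(codons) and codons[-1] in STOP_CODONS,
        "premature_stop_codon_indices": premature,
    }


# Hypothetical four-codon transcript: AUG GCU AAA UGA
report = check_coding_portion("AUGGCUAAAUGA")
```

A check of this kind only confirms the reading frame of a candidate insert; it says nothing about promoter linkage or the ribosome binding site.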

As indicated, the expression vectors will preferably include at least one
selectable marker. Such markers include dihydrofolate reductase, G418 or neomycin
resistance for eukaryotic cell culture and tetracycline, kanamycin or ampicillin
resistance genes for culturing in E. coli and other bacteria. Representative examples of
appropriate hosts include, but are not limited to, bacterial cells, such as E. coli,
Streptomyces and Salmonella typhimurium cells; fungal cells, such as yeast cells;
insect cells such as Drosophila S2 and Spodoptera Sf9 cells; animal cells such as CHO,
COS, 293 and Bowes melanoma cells; and plant cells. Appropriate culture media
and conditions for the above-described host cells are known in the art.

Vectors preferred for use in bacteria include pQE70, pQE60 and
pQE-9, available from QIAGEN, Inc., supra; pBluescript vectors, Phagescript
vectors, pNH8A, pNH16a, pNH18A, pNH46A, available from Stratagene Cloning
Systems, Inc.; and ptrc99a, pKK223-3, pKK233-3, pDR540, pRIT5 available from
Pharmacia Biotech, Inc. Among preferred eukaryotic vectors are pWLNEO,
pSV2CAT, pOG44, pXT1 and pSG available from Stratagene; and pSVK3, pBPV,
pMSG and pSVL available from Pharmacia. Other suitable vectors will be readily
apparent to the skilled artisan.

Introduction of the construct into the host cell can be effected by calcium
phosphate transfection, DEAE-dextran mediated transfection, cationic lipid-mediated
transfection, electroporation, transduction, infection or other methods. Such methods
are described in many standard laboratory manuals, such as Davis et al., Basic
Methods In Molecular Biology (1986).

The polypeptide may be expressed in a modified form, such as a fusion
protein, and may include not only secretion signals, but also additional heterologous
functional regions. For instance, a region of additional amino acids, particularly
charged amino acids, may be added to the N-terminus of the polypeptide to improve
stability and persistence in the host cell, during purification, or during subsequent
handling and storage. Also, peptide moieties may be added to the polypeptide to
facilitate purification. Such regions may be removed prior to final preparation of the
polypeptide. The addition of peptide moieties to polypeptides to engender secretion
or excretion, to improve stability and to facilitate purification, among others, are
familiar and routine techniques in the art. A preferred fusion protein comprises a
heterologous region from immunoglobulin that is useful to stabilize and purify
proteins. For example, EP-A-0 464 533 (Canadian counterpart 2045869) discloses
fusion proteins comprising various portions of constant region of immunoglobulin
molecules together with another human protein or part thereof. In many cases, the Fc
part in a fusion protein is thoroughly advantageous for use in therapy and diagnosis
and thus results, for example, in improved pharmacokinetic properties (EP-A 0232
262). On the other hand, for some uses it would be desirable to be able to delete the
Fc part after the fusion protein has been expressed, detected and purified in the
advantageous manner described. This is the case when the Fc portion proves to be a
hindrance to use in therapy and diagnosis, for example when the fusion protein is to
be used as antigen for immunizations. In drug discovery, for example, human
proteins, such as hIL-5, have been fused with Fc portions for the purpose of
high-throughput screening assays to identify antagonists of hIL-5. See, D. Bennett et
al., J. Molecular Recognition 8:52-58 (1995) and K. Johanson et al., J. Biol. Chem.
270:9459-9471 (1995).

A protein of this invention can be recovered and purified from recombinant cell 
cultures by well-known methods including ammonium sulfate or ethanol precipitation,
acid extraction, anion or cation exchange chromatography, phosphocellulose
chromatography, hydrophobic interaction chromatography, affinity chromatography, 
hydroxylapatite chromatography and lectin chromatography. Most preferably, high 
performance liquid chromatography ("HPLC") is employed for purification. 

Polypeptides of the present invention include: products purified from natural
sources, including bodily fluids, tissues and cells, whether directly isolated or cultured;
products of chemical synthetic procedures; and products produced by recombinant
techniques from a prokaryotic or eukaryotic host, including, for example, bacterial,
yeast, higher plant, insect and mammalian cells. Depending upon the host employed
in a recombinant production procedure, the polypeptides of the present invention
may be glycosylated or may be non-glycosylated. In addition, polypeptides of the
invention may also include an initial modified methionine residue, in some cases as a
result of host-mediated processes. Thus, it is well known in the art that the
N-terminal methionine encoded by the translation initiation codon generally is
removed with high efficiency from any protein after translation in all eukaryotic cells.
While the N-terminal methionine on most proteins also is efficiently removed in most 
prokaryotes, for some proteins this prokaryotic removal process is inefficient, 
depending on the nature of the amino acid to which the N-terminal methionine is 
covalently linked. 

Polypeptides and Fragments

The invention further provides isolated polypeptides having an amino acid 
sequence encoded by a deposited cDNA, or an amino acid sequence in the sequence 
listing identified SEQ ID NO:Y as defined in Table 1, or a peptide or polypeptide 
comprising a portion of the above polypeptides. At the simplest level, the amino acid
sequence can be synthesized using commercially available peptide synthesizers. This
is particularly useful in producing small peptides and fragments of larger 
polypeptides. Such fragments are useful, for example, in generating antibodies against 
the native polypeptide. 



Variant and Mutant Polypeptides

To improve or alter the characteristics of the polypeptides of the invention, 
protein engineering may be employed. Recombinant DNA technology known to 
those skilled in the art can be used to create novel mutant proteins or "muteins" 
including single or multiple amino acid substitutions, deletions, additions or fusion 
proteins. Such modified polypeptides can show, e.g., enhanced activity or increased 
stability. In addition, they may be purified in higher yields and show better solubility 
than the corresponding natural polypeptide, at least under certain purification and 
storage conditions. 

For instance, for many proteins, including the mature form(s) of a secreted 
protein, it is known in the art that one or more amino acids may be deleted from the 
N-terminus or C-terminus without substantial loss of biological function. For 
instance, Ron et al., J. Biol. Chem., 268:2984-2988 (1993) reported modified KGF
proteins that had heparin binding activity even if 3, 8, or 27 amino-terminal amino acid 
residues were missing. Similarly, many examples of biologically functional C-terminal 
deletion muteins are known. For instance, Interferon gamma shows up to ten times 
higher activities by deleting 8-10 amino acid residues from the carboxy terminus of the 
protein (Dobeli et al., J. Biotechnology 7:199-216 (1988)). Furthermore, even if
deletion of one or more amino acids from the N-terminus or C-terminus of a protein 
results in modification or loss of one or more biological functions of the protein, other 
biological activities may still be retained. Thus, the ability of the shortened protein to 
induce and/or bind to antibodies which recognize the complete or mature form of the 
protein generally will be retained when less than the majority of the residues of the 
complete or mature form of the protein are removed from the N-terminus or C- 
terminus. Whether a particular polypeptide lacking N- or C-terminal residues of a 
complete protein retains such immunologic activities can readily be determined by 
routine methods described herein and otherwise known in the art. 
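
The set of N- and C-terminal deletion forms contemplated above can be enumerated before any are made and tested. The following is a minimal sketch under stated assumptions: the function name and the ten-residue parent sequence are hypothetical, and the code only lists candidate truncations without predicting whether any of them retains activity.

```python
def terminal_deletions(sequence: str, max_n: int, max_c: int):
    """Yield (n_deleted, c_deleted, truncated_sequence) for each combination
    of 0..max_n residues removed from the N-terminus and 0..max_c residues
    removed from the C-terminus."""
    for n in range(max_n + 1):
        for c in range(max_c + 1):
            if n + c >= len(sequence):
                continue  # nothing would remain of the polypeptide
            yield n, c, sequence[n:len(sequence) - c]


parent = "MKTAYIAKQR"  # hypothetical sequence, for illustration only
variants = list(terminal_deletions(parent, max_n=2, max_c=1))
```

Each candidate would still have to be expressed and assayed by the routine methods referred to above.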

In addition to terminal deletion forms of the protein discussed above, it also 
will be recognized by one of ordinary skill in the art that some amino acid sequences
of a polypeptide can be varied without significant effect on the structure or function of
the protein. If such differences in sequence are contemplated, it should be 
remembered that there will be critical areas on the protein which determine activity. 

Thus, the invention further includes variants of a polypeptide which show 
substantial biological activity or which include regions of the protein such as the 
portions discussed below. Such mutants include deletions, insertions, inversions, 
repeats, and type substitutions selected according to general rules known in the art so
as to have little effect on activity. For example, guidance concerning how to make
phenotypically silent amino acid substitutions is provided in Bowie, J. U. et al.,
"Deciphering the Message in Protein Sequences: Tolerance to Amino Acid
Substitutions," Science 247:1306-1310 (1990), wherein the authors indicate that there
are two main approaches for studying the tolerance of an amino acid sequence to 
change. The first method relies on the process of evolution, in which mutations are 
either accepted or rejected by natural selection. The second approach uses genetic 
engineering to introduce amino acid changes at specific positions of a cloned gene and 
selections or screens to identify sequences that maintain functionality. 

As the authors state, these studies have revealed that proteins are surprisingly
tolerant of amino acid substitutions. The authors further indicate which amino acid 
changes are likely to be permissive at a certain position of the protein. For example, 
most buried amino acid residues require nonpolar side chains, whereas few features of 
surface side chains are generally conserved. Other such phenotypically silent 
substitutions are described in Bowie, J. U. et al., supra, and the references cited
therein. Typically seen as conservative substitutions are the replacements, one for
another, among the aliphatic amino acids Ala, Val, Leu and Ile; interchange of the
hydroxyl residues Ser and Thr; exchange of the acidic residues Asp and Glu;
substitution between the amide residues Asn and Gln; exchange of the basic residues
Lys and Arg; and replacements among the aromatic residues Phe and Tyr.

Thus, the fragment, derivative or analog of a polypeptide shown in the figures 
(and sequence listing), or one encoded by the deposited cDNA, may be (i) one in
which one or more of the amino acid residues are substituted with a conserved or
non-conserved amino acid residue (preferably a conserved amino acid residue) and such
substituted amino acid residue may or may not be one encoded by the genetic code, or
(ii) one in which one or more of the amino acid residues includes a substituent group,
or (iii) one in which the mature polypeptide is fused with another compound, such as
a compound to increase the half-life of the polypeptide (for example, polyethylene
glycol), or (iv) one in which the additional amino acids are fused to the above form of
the polypeptide, such as an IgG Fc fusion region peptide or leader or secretory
sequence or a sequence which is employed for purification of the above form of the
polypeptide or a proprotein sequence. Such fragments, derivatives and analogs are
deemed to be within the scope of those skilled in the art from the teachings herein.

Thus, the mature polypeptide of the present invention may include one or
more amino acid substitutions, deletions or additions, either from natural mutations or
human manipulation. As indicated, changes are preferably of a minor nature, such as
conservative amino acid substitutions that do not significantly affect the folding or
activity of the protein (see Table 2).

TABLE 2. CONSERVATIVE AMINO ACID SUBSTITUTIONS

Aromatic        Phenylalanine
                Tryptophan
                Tyrosine

Hydrophobic     Leucine
                Isoleucine
                Valine

Polar           Glutamine

Basic           Arginine
                Lysine
                Histidine

Acidic          Aspartic Acid
                Glutamic Acid

Small           Alanine
                Serine
                Threonine
                Methionine
                Glycine
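
The groupings of Table 2 lend themselves to a simple lookup. The following is a minimal sketch, assuming the groups exactly as they appear in Table 2 of this text and using standard one-letter amino acid codes; the function names are hypothetical. Residues that Table 2 does not list (for example proline or cysteine) are treated as having no conservative partner.

```python
# Groups transcribed from Table 2 as it appears in this text,
# written with standard one-letter amino acid codes.
TABLE_2_GROUPS = {
    "Aromatic": {"F", "W", "Y"},
    "Hydrophobic": {"L", "I", "V"},
    "Polar": {"Q"},
    "Basic": {"R", "K", "H"},
    "Acidic": {"D", "E"},
    "Small": {"A", "S", "T", "M", "G"},
}


def shared_group(original: str, replacement: str):
    """Return the name of the Table 2 group holding both residues, or None."""
    for name, members in TABLE_2_GROUPS.items():
        if original in members and replacement in members:
            return name
    return None


def is_conservative(original: str, replacement: str) -> bool:
    """True if the change swaps two different residues of one Table 2 group."""
    return original != replacement and shared_group(original, replacement) is not None
```

For example, `is_conservative("L", "I")` holds because both residues fall in the hydrophobic group, whereas `is_conservative("K", "D")` does not, since a basic residue is exchanged for an acidic one.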



Amino acids in the protein of the present invention that are essential for 
function can be identified by methods known in the art, such as site-directed 
mutagenesis or alanine-scanning mutagenesis (Cunningham and Wells, Science 
244:1081-1085 (1989)). The latter procedure introduces single alanine mutations at 
every residue in the molecule. The resulting mutant molecules are then tested for 
biological activity such as receptor binding or in vitro proliferative activity.
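
The mutant panel used in alanine-scanning mutagenesis can be listed directly from the parent sequence. The following is a minimal sketch, not the procedure of Cunningham and Wells itself: the function name, the labeling convention and the four-residue example are assumptions, and positions that are already alanine are skipped because no substitution is possible there.

```python
def alanine_scan(sequence: str):
    """Return (label, mutant_sequence) pairs, one per non-alanine position,
    each with that single residue replaced by alanine."""
    mutants = []
    for i, residue in enumerate(sequence):
        if residue == "A":
            continue  # already alanine, so there is nothing to substitute
        label = f"{residue}{i + 1}A"  # e.g. K2A: Lys at position 2 -> Ala
        mutants.append((label, sequence[:i] + "A" + sequence[i + 1:]))
    return mutants


scan = alanine_scan("MKAD")  # hypothetical four-residue sequence
```

Each listed mutant would then be made and tested for biological activity as described above.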

Of special interest are substitutions of charged amino acids with other charged 
or neutral amino acids which may produce proteins with highly desirable improved 
characteristics, such as less aggregation. Aggregation may not only reduce activity but 
also be problematic when preparing pharmaceutical formulations, because aggregates 
can be immunogenic (Pinckard et al., Clin. Exp. Immunol. 2:331-340 (1967); Robbins
et al., Diabetes 36:838-845 (1987); Cleland et al., Crit. Rev. Therapeutic Drug
Carrier Systems 10:307-377 (1993)).

Replacement of amino acids can also change the selectivity of the binding of a 
ligand to cell surface receptors. For example, Ostade et al., Nature 361:266-268
(1993) describes certain mutations resulting in selective binding of TNF-α to only one
of the two known types of TNF receptors. Sites that are critical for ligand-receptor
binding can also be determined by structural analysis such as crystallization, nuclear
magnetic resonance or photoaffinity labeling (Smith et al., J. Mol. Biol. 224:899-904
(1992) and de Vos et al., Science 255:306-312 (1992)).

The polypeptides of the present invention are preferably provided in an 
isolated form, and preferably are substantially purified. A recombinantly produced
version of a polypeptide of the invention can be substantially purified by the
one-step method described in Smith and Johnson, Gene 67:31-40 (1988).
Polypeptides of the invention also can be purified from natural or recombinant 
sources using antibodies of the invention raised against the protein in methods which 
are well known in the art of protein purification. 

Further polypeptides of the present invention include polypeptides which 
have at least 90% similarity, more preferably at least 95% similarity, and still more 
preferably at least 96%, 97%, 98% or 99% similarity to those described above. The 
polypeptides of the invention also comprise those which are at least 80% identical, 
more preferably at least 90% or 95% identical, still more preferably at least 96%, 
97%, 98% or 99% identical to a polypeptide encoded by a deposited cDNA or to the 
polypeptide of SEQ ID NO: Y, and also include portions of such polypeptides with at 
least 30 amino acids and more preferably at least 50 amino acids. 

By "% similarity" for two polypeptides is intended a similarity score 
produced by comparing the amino acid sequences of the two polypeptides using the 
Bestfit program (Wisconsin Sequence Analysis Package, Version 8 for Unix, Genetics 
Computer Group, University Research Park, 575 Science Drive, Madison, WI 53711)
and the default settings for determining similarity. Bestfit uses the local homology
algorithm of Smith and Waterman (Advances in Applied Mathematics 2:482-489,
1981) to find the best segment of similarity between two sequences.
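
The local homology algorithm can be illustrated with a short dynamic-programming routine. The following is a minimal sketch of Smith-Waterman local alignment only; it is not the Bestfit program and does not reproduce its default settings. The scoring values (match +2, mismatch -1, linear gap -2), the function name and the example sequences are assumptions chosen for illustration, whereas a similarity score for polypeptides would ordinarily use an amino acid scoring matrix.

```python
def smith_waterman(a: str, b: str, match: int = 2, mismatch: int = -1, gap: int = -2):
    """Return (best_score, aligned_a, aligned_b) for the best local alignment."""
    rows, cols = len(a) + 1, len(b) + 1
    # H[i][j] is the best score of a local alignment ending at a[i-1], b[j-1].
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)
    for i in range(1, rows):
        for j in range(1, cols):
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            H[i][j] = max(0, diag, H[i - 1][j] + gap, H[i][j - 1] + gap)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the highest-scoring cell until a zero cell is reached.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        step = match if a[i - 1] == b[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + step:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


# Hypothetical sequences sharing the segment MKTAY.
score, seg_a, seg_b = smith_waterman("QQQMKTAYWW", "PPMKTAYGG")
```

Because scores are floored at zero, the routine reports only the best-matching segment rather than forcing an end-to-end alignment, which is the sense in which the algorithm finds the best segment of similarity.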

By a polypeptide having an amino acid sequence at least, for example, 95% 
"identical" to a reference amino acid sequence of a polypeptide described herein is 
intended that the amino acid sequence of the polypeptide is identical to the reference 
sequence except that the polypeptide sequence may include up to five amino acid 
alterations per each 100 amino acids of the reference amino acid of the polypeptide of 
the invention. In other words, to obtain a polypeptide having an amino acid sequence 
at least 95% identical to a reference amino acid sequence, up to 5% of the amino acid 
residues in the reference sequence may be deleted or substituted with another amino 
acid, or a number of amino acids up to 5% of the total amino acid residues in the 
reference sequence may be inserted into the reference sequence. These alterations of 
the reference sequence may occur at the amino or carboxy terminal positions of the 
reference amino acid sequence or anywhere between those terminal positions, 
interspersed either individually among residues in the reference sequence or in one or 
more contiguous groups within the reference sequence. 
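
The arithmetic behind this definition can be made explicit. The following is a minimal sketch of one reading of the passage above, in which every substituted, deleted or inserted position counts as one alteration and the total is measured against the length of the reference sequence. The function names, the gapped-string representation of an alignment (with "-" marking a gap) and the example sequences are assumptions for illustration.

```python
def max_alterations(reference_length: int, percent_identity: int) -> int:
    """Largest whole number of alterations allowed at a given percent identity,
    e.g. 5 alterations per 100 reference residues at 95%."""
    return reference_length * (100 - percent_identity) // 100


def percent_identity(aligned_ref: str, aligned_query: str) -> float:
    """Percent identity over the full reference length for a gapped alignment.

    Both strings must have equal length, and the alignment is assumed to
    cover the whole reference sequence.
    """
    if len(aligned_ref) != len(aligned_query):
        raise ValueError("aligned sequences must have the same length")
    reference_length = sum(1 for r in aligned_ref if r != "-")
    # Any column whose two symbols differ is a substitution, deletion or insertion.
    alterations = sum(1 for r, q in zip(aligned_ref, aligned_query) if r != q)
    return 100.0 * (reference_length - alterations) / reference_length


# Hypothetical ten-residue reference with one substitution (I -> L).
identity_sub = percent_identity("MKTAYIAKQR", "MKTAYLAKQR")
# The same reference with one residue inserted into the query.
identity_ins = percent_identity("MKTAY-IAKQR", "MKTAYGIAKQR")
```

Under this reading a 100-residue reference tolerates `max_alterations(100, 95)`, that is 5, alterations at the 95% level, and both example alignments differ from the reference by a single alteration.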

As a practical matter, whether any particular polypeptide is at least 90%, 
95%, 96%, 97%, 98% or 99% identical to, for instance, an amino acid sequence shown 
in the sequence listing or to an amino acid sequence encoded by the deposited cDNA 
can be determined conventionally using known computer programs such as the Bestfit
program (Wisconsin Sequence Analysis Package, Version 8 for Unix, Genetics
Computer Group, University Research Park, 575 Science Drive, Madison, WI 53711).
When using Bestfit or any other sequence alignment program to determine whether a 
particular sequence is, for instance, 95% identical to a reference sequence according to 
the present invention, the parameters are set, of course, such that the percentage of 
identity is calculated over the full length of the reference amino acid sequence and that 
gaps in homology of up to 5% of the total number of amino acid residues in the 
reference sequence are allowed.

The polypeptide of the present invention could be used as a molecular weight
marker on SDS-PAGE gels or on molecular sieve gel filtration columns using methods 
well known to those of skill in the art. 

As described in detail below, the polypeptides of the present invention can 
also be used to raise polyclonal and monoclonal antibodies, which are useful in assays
for detecting the corresponding protein expression as described below or as agonists
and antagonists capable of enhancing or inhibiting function of the protein. Further,
such polypeptides can be used in the yeast two-hybrid system to "capture" receptors
of secreted proteins which are also candidate agonists and antagonists according to the
present invention. The yeast two-hybrid system is described in Fields and Song,
Nature 340:245-246 (1989).

Epitope-Bearing Portions 

In another aspect, the invention provides a peptide or polypeptide comprising 
an epitope-bearing portion of a polypeptide of the invention. The epitope of this 
polypeptide portion is an immunogenic or antigenic epitope of a polypeptide of the 
invention. An "immunogenic epitope" is defined as a part of a protein that elicits an 
antibody response when the whole protein is the immunogen. On the other hand, a 
region of a protein molecule to which an antibody can bind is defined as an "antigenic 
epitope." The number of immunogenic epitopes of a protein generally is less than the 
number of antigenic epitopes. See, for instance, Geysen et al., Proc. Natl. Acad. Sci.
USA 81:3998-4002 (1983).

As to the selection of peptides or polypeptides bearing an antigenic epitope 
(i.e., that contain a region of a protein molecule to which an antibody can bind), it is 
well known in the art that relatively short synthetic peptides that mimic part of a
protein sequence are routinely capable of eliciting an antiserum that reacts with the
partially mimicked protein. See, for instance, Sutcliffe, J. G., Shinnick, T. M., Green,
N. and Lerner, R. A. (1983) "Antibodies that react with predetermined sites on
proteins," Science 219:660-666. Peptides capable of eliciting protein-reactive sera are
frequently represented in the primary sequence of a protein, can be characterized by a 
set of simple chemical rules, and are confined neither to immunodominant regions of 
intact proteins (i.e., immunogenic epitopes) nor to the amino or carboxyl terminals. 
Antigenic epitope-bearing peptides and polypeptides of the invention are therefore 
useful to raise antibodies, including monoclonal antibodies, that bind specifically to a 
polypeptide of the invention. See, for instance, Wilson et al., Cell 37:767-778 (1984)
at 777.

Antigenic epitope-bearing peptides and polypeptides of the invention 
preferably contain a sequence of at least seven, more preferably at least nine and most 
preferably between about 15 and about 30 amino acids contained within the amino acid
sequence of a polypeptide of the invention. 
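
Candidate peptides of a given length contained within a polypeptide sequence can be enumerated with a sliding window. The following is a minimal sketch: the function name and the twelve-residue example sequence are hypothetical, and the code only lists contiguous subsequences without predicting which of them actually bears an antigenic epitope.

```python
def peptide_windows(sequence: str, length: int):
    """Return (start_position, peptide) for every contiguous window of the
    given length; start positions are 1-based."""
    return [
        (i + 1, sequence[i:i + length])
        for i in range(len(sequence) - length + 1)
    ]


polypeptide = "MKTAYIAKQRQI"  # hypothetical sequence, for illustration only
# Seven residues is the minimum length stated above.
seven_mers = peptide_windows(polypeptide, 7)
```

The same call with a length of 9, or of 15 to 30 for a longer sequence, lists the candidates at the more preferred lengths.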

The epitope-bearing peptides and polypeptides of the invention may be 
produced by any conventional means. See, e.g., Houghten, R. A. (1985) "General 
method for the rapid solid-phase synthesis of large numbers of peptides: specificity 
of antigen-antibody interaction at the level of individual amino acids." Proc. Natl.
Acad. Sci. USA 82:5131-5135; this "Simultaneous Multiple Peptide Synthesis
(SMPS)" process is further described in U.S. Patent No. 4,631,211 to Houghten et al.
(1986).

Epitope-bearing peptides and polypeptides of the invention are used to induce 
antibodies according to methods well known in the art. See, for instance, Sutcliffe et 
al., supra; Wilson et al., supra; Chow, M. et al., Proc. Natl. Acad. Sci. USA
82:910-914; and Bittle, F. J. et al., J. Gen. Virol. 66:2347-2354 (1985). Immunogenic
epitope-bearing peptides of the invention, i.e., those parts of a protein that elicit an
antibody response when the whole protein is the immunogen, are identified according
to methods known in the art. See, for instance, Geysen et al., supra. Further still,
U.S. Patent No. 5,194,392 to Geysen (1990) describes a general method of detecting 
or determining the sequence of monomers (amino acids or other compounds) which is 
a topological equivalent of the epitope (i.e., a "mimotope") which is complementary 
to a particular paratope (antigen binding site) of an antibody of interest. More
generally, U.S. Patent No. 4,433,092 to Geysen (1989) describes a method of 
detecting or determining a sequence of monomers which is a topographical equivalent 
of a ligand which is complementary to the ligand binding site of a particular receptor 
of interest. Similarly, U.S. Patent No. 5,480,971 to Houghten, R. A. et al. (1996) on 
Peralkylated Oligopeptide Mixtures discloses linear C1-C7-alkyl peralkylated
oligopeptides and sets and libraries of such peptides, as well as methods for using
such oligopeptide sets and libraries for determining the sequence of a peralkylated
oligopeptide that preferentially binds to an acceptor molecule of interest. Thus,
non-peptide analogs of the epitope-bearing peptides of the invention also can be made
routinely by these methods.

Fusion Proteins 

As one of skill in the art will appreciate, polypeptides of the present invention
and the epitope-bearing fragments thereof described above can be combined with parts
of the constant domain of immunoglobulins (IgG), resulting in chimeric polypeptides.
These fusion proteins facilitate purification and show an increased half-life in vivo.
This has been shown, e.g., for chimeric proteins consisting of the first two domains of
the human CD4-polypeptide and various domains of the constant regions of the
heavy or light chains of mammalian immunoglobulins (EP A 394,827; Traunecker et
al., Nature 331:84-86 (1988)). Fusion proteins that have a disulfide-linked dimeric
structure due to the IgG part can also be more efficient in binding and neutralizing
other molecules than the monomeric secreted protein or protein fragment alone
(Fountoulakis et al., J. Biochem. 270:3958-3964 (1995)).

Antibodies 

Protein-species specific antibodies for use in the present invention can be 
raised against an intact protein or an antigenic polypeptide fragment thereof, which
may be presented together with a carrier protein, such as an albumin, to an animal 
system (such as rabbit or mouse) or, if it is long enough (at least about 25 amino 
acids), without a carrier.

As used herein, the term "antibody" (Ab) or "monoclonal antibody" (Mab) is 
meant to include intact molecules as well as antibody fragments (such as, for example, 
Fab and F(ab')2 fragments) which are capable of specifically binding to protein. Fab 
and F(ab')2 fragments lack the Fc fragment of intact antibody, clear more rapidly from
the circulation, and may have less non-specific tissue binding than an intact antibody
(Wahl et al., J. Nucl. Med. 24:316-325 (1983)). Thus, these fragments are preferred.

The antibodies of the present invention may be prepared by any of a variety 
of methods. For example, cells expressing the protein of the present invention or an 
antigenic fragment thereof can be administered to an animal in order to induce the
production of sera containing polyclonal antibodies. In a preferred method, a
preparation of the secreted protein is prepared and purified to render it substantially
free of natural contaminants. Such a preparation is then introduced into an animal in 
order to produce polyclonal antisera of greater specific activity. 

In the most preferred method, the antibodies of the present invention are
monoclonal antibodies (or protein binding fragments thereof). Such monoclonal
antibodies can be prepared using hybridoma technology (Kohler et al., Nature 256:495
(1975); Kohler et al., Eur. J. Immunol. 6:511 (1976); Kohler et al., Eur. J. Immunol.
6:292 (1976); Hammerling et al., in: Monoclonal Antibodies and T-Cell Hybridomas,
Elsevier, N.Y., (1981) pp. 563-681). In general, such procedures involve immunizing
an animal (preferably a mouse) with a protein antigen of the invention or, more
preferably, with a protein-expressing cell. Such cells may be cultured in any suitable
tissue culture medium; however, it is preferable to culture cells in Earle's modified
Eagle's medium supplemented with 10% fetal bovine serum (inactivated at about 56°C),
and supplemented with about 10 g/l of nonessential amino acids, about 1,000 U/ml
of penicillin, and about 100 µg/ml of streptomycin. The splenocytes of such mice are
extracted and fused with a suitable myeloma cell line. Any suitable myeloma cell line
may be employed in accordance with the present invention; however, it is preferable
to employ the parent myeloma cell line (SP20), available from the American Type
Culture Collection, Rockville, Maryland. After fusion, the resulting hybridoma cells
are selectively maintained in HAT medium, and then cloned by limiting dilution as
described by Wands et al. (Gastroenterology 80:225-232 (1981)). The hybridoma
cells obtained through such a selection are then assayed to identify clones which
secrete antibodies capable of binding the protein antigen.

Alternatively, additional antibodies capable of binding to the protein antigen of
the invention may be produced in a two-step procedure through the use of
anti-idiotypic antibodies. Such a method makes use of the fact that antibodies are
themselves antigens, and that, therefore, it is possible to obtain an antibody which
binds to a second antibody. In accordance with this method, protein-specific
antibodies are used to immunize an animal, preferably a mouse. The splenocytes of
such an animal are then used to produce hybridoma cells, and the hybridoma cells are
screened to identify clones which produce an antibody whose ability to bind to the
protein-specific antibody can be blocked by the protein antigen. Such antibodies
comprise anti-idiotypic antibodies to the protein-specific antibody and can be used to
immunize an animal to induce formation of further protein-specific antibodies.

It will be appreciated that Fab and F(ab')2 and other fragments of the 
antibodies of the present invention may be used according to the methods disclosed 
herein. Such fragments are typically produced by proteolytic cleavage, using enzymes 
such as papain (to produce Fab fragments) or pepsin (to produce F(ab')2 fragments).
Alternatively, protein-binding fragments can be produced through the application of
recombinant DNA technology or through synthetic chemistry. 

For in vivo use of antibodies in humans, it may be preferable to use 
"humanized" chimeric monoclonal antibodies. Such antibodies can be produced using 
genetic constructs derived from hybridoma cells producing the monoclonal antibodies
described above. Methods for producing chimeric antibodies are known in the art.
See, for review, Morrison, Science 229:1202 (1985); Oi et al., BioTechniques 4:214
(1986); Cabilly et al., U.S. Patent No. 4,816,567; Taniguchi et al., EP 171496;
Morrison et al., EP 173494; Neuberger et al., WO 8601533; Robinson et al., WO
8702671; Boulianne et al., Nature 312:643 (1984); Neuberger et al., Nature 314:268
(1985).

Identification and Diagnostic Applications 

Assaying protein levels in a biological sample can occur using antibody-based
techniques. For example, protein expression in tissues can be studied with classical
immunohistological methods (Jalkanen, M., et al., J. Cell Biol. 101:976-985 (1985);
Jalkanen, M., et al., J. Cell Biol. 105:3087-3096 (1987)). Other antibody-based
methods useful for detecting protein gene expression include immunoassays, such as
the enzyme linked immunosorbent assay (ELISA) and the radioimmunoassay (RIA).
Suitable antibody assay labels are known in the art and include enzyme labels, such as
glucose oxidase, and radioisotopes, such as iodine (125I, 121I), carbon (14C), sulfur (35S),
tritium (3H), indium (112In), and technetium (99mTc), and fluorescent labels, such as
fluorescein and rhodamine, and biotin.

In addition to assaying protein levels in a biological sample obtained from an
individual, protein can also be detected in vivo by imaging. Antibody labels or
markers for in vivo imaging of protein include those detectable by X-radiography,
NMR or ESR. For X-radiography, suitable labels include radioisotopes such as
barium or cesium, which emit detectable radiation but are not overtly harmful to the
subject. Suitable markers for NMR and ESR include those with a detectable
characteristic spin, such as deuterium, which may be incorporated into the antibody
by labeling of nutrients for the relevant hybridoma.

A protein-specific antibody or antibody fragment which has been labeled with
an appropriate detectable imaging moiety, such as a radioisotope (for example, 131I,
112In, 99mTc), a radio-opaque substance, or a material detectable by nuclear magnetic
resonance, is introduced (for example, parenterally, subcutaneously or
intraperitoneally) into the mammal to be examined for immune system disorder. It
will be understood in the art that the size of the subject and the imaging system used
will determine the quantity of imaging moiety needed to produce diagnostic images. In
the case of a radioisotope moiety, for a human subject, the quantity of radioactivity
injected will normally range from about 5 to 20 millicuries of 99mTc. The labeled
antibody or antibody fragment will then preferentially accumulate at the location of
cells which contain the specific protein. In vivo tumor imaging is described in S.W.
Burchiel et al., "Immunopharmacokinetics of Radiolabeled Antibodies and Their
Fragments" (Chapter 13 in Tumor Imaging: The Radiochemical Detection of Cancer,
S.W. Burchiel and B. A. Rhodes, eds., Masson Publishing Inc. (1982)).

Treatment of Conditions Related to Proteins of the Invention 

It will be appreciated that conditions caused by a decrease in the standard or
normal expression level of a protein of the invention, particularly a secreted protein, in
an individual can be treated by administration of the polypeptide (in the form of a 
mature protein for secreted polypeptides). Thus, the invention also provides a 
method of treatment of an individual in need of an increased level of the protein of the 
present invention comprising administering to such an individual a pharmaceutical
composition comprising an amount of the isolated polypeptide of the invention
effective to increase the activity level of the protein in such an individual. 

Formulations 

The polypeptide composition will be formulated and dosed in a fashion consistent
with good medical practice, taking into account the clinical condition of the individual
patient (especially the side effects of treatment with the polypeptide alone), the site
of delivery, the method of administration, the scheduling of administration, and other 
factors known to practitioners. The "effective amount" for purposes herein is thus 
determined by such considerations. 

As a general proposition, the total pharmaceutically effective amount of a
polypeptide administered parenterally per dose will be in the range of about 1
µg/kg/day to 10 mg/kg/day of patient body weight, although, as noted above, this will
be subject to therapeutic discretion. More preferably, this dose is at least 0.01
mg/kg/day, and most preferably for humans between about 0.01 and 1 mg/kg/day for
the hormone. If given continuously, the polypeptide is typically administered at a
dose rate of about 1 µg/kg/hour to about 50 µg/kg/hour, either by 1-4 injections per
day or by continuous subcutaneous infusions, for example, using a mini-pump. An
intravenous bag solution may also be employed. The length of treatment needed to
observe changes and the interval following treatment for responses to occur appear to
vary depending on the desired effect.
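
The weight-based ranges stated above reduce to simple multiplication. As an illustrative sketch only (the function names and the 70 kg example weight are assumptions introduced here, not part of the specification, and the output is not dosing guidance), the arithmetic can be written as:

```python
def daily_dose_range_ug(weight_kg, low_ug_per_kg=1.0, high_ug_per_kg=10_000.0):
    """Total parenteral dose range in micrograms per day.

    Defaults are the limits stated in the text: about 1 ug/kg/day
    up to 10 mg/kg/day (written here as 10,000 ug/kg/day).
    """
    if weight_kg <= 0:
        raise ValueError("weight_kg must be positive")
    return (weight_kg * low_ug_per_kg, weight_kg * high_ug_per_kg)


def infusion_rate_range_ug_per_hour(weight_kg, low=1.0, high=50.0):
    """Continuous dose-rate range in micrograms per hour.

    Defaults are the stated limits of about 1 to 50 ug/kg/hour.
    """
    if weight_kg <= 0:
        raise ValueError("weight_kg must be positive")
    return (weight_kg * low, weight_kg * high)


# Hypothetical 70 kg subject, chosen only to make the arithmetic concrete.
print(daily_dose_range_ug(70.0))              # (70.0, 700000.0)
print(infusion_rate_range_ug_per_hour(70.0))  # (70.0, 3500.0)
```

For the assumed 70 kg subject, the stated range therefore spans 70 µg to 700 mg per day, and the continuous rate spans 70 µg to 3.5 mg per hour.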

Pharmaceutical compositions containing the protein of the invention may be
administered orally, rectally, parenterally, intracisternally, intravaginally,
intraperitoneally, topically (as by powders, ointments, drops or transdermal patch),
buccally, or as an oral or nasal spray. By "pharmaceutically acceptable carrier" is
meant a non-toxic solid, semisolid or liquid filler, diluent, encapsulating material or
formulation auxiliary of any type. The term "parenteral" as used herein refers to
modes of administration which include intravenous, intramuscular, intraperitoneal,
intrasternal, subcutaneous and intraarticular injection and infusion.

The polypeptide is also suitably administered by sustained-release systems.
Suitable examples of sustained-release compositions include semi-permeable polymer
matrices in the form of shaped articles, e.g., films, or microcapsules. Sustained-release
matrices include polylactides (U.S. Pat. No. 3,773,919, EP 58,481); copolymers of
L-glutamic acid and gamma-ethyl-L-glutamate (Sidman, U. et al., Biopolymers
22:547-556 (1983)), poly (2-hydroxyethyl methacrylate) (R. Langer et al., J. Biomed. Mater.
Res. 15:167-277 (1981), and R. Langer, Chem. Tech. 12:98-105 (1982)), ethylene
vinyl acetate (R. Langer et al., Id.) or poly-D-(-)-3-hydroxybutyric acid (EP
133,988). Sustained-release compositions also include liposomally entrapped
polypeptides. Liposomes containing the polypeptide are prepared by methods
known per se: DE 3,218,121; Epstein et al., Proc. Natl. Acad. Sci. (USA)
82:3688-3692 (1985); Hwang et al., Proc. Natl. Acad. Sci. (USA) 77:4030-4034 (1980); EP
52,322; EP 36,676; EP 88,046; EP 143,949; EP 142,641; Japanese Pat. Appl.
83-118008; U.S. Pat. Nos. 4,485,045 and 4,544,545; and EP 102,324. Ordinarily, the
liposomes are of the small (about 200-800 Angstroms) unilamellar type in which the
lipid content is greater than about 30 mol. percent cholesterol, the selected proportion
being adjusted for the optimal polypeptide therapy.

For parenteral administration, in one embodiment, the polypeptide is
formulated generally by mixing it at the desired degree of purity, in a unit dosage
injectable form (solution, suspension, or emulsion), with a pharmaceutically
acceptable carrier, i.e., one that is non-toxic to recipients at the dosages and 
concentrations employed and is compatible with other ingredients of the formulation. 
For example, the formulation preferably does not include oxidizing agents and other 
compounds that are known to be deleterious to polypeptides. 

Generally, the formulations are prepared by contacting the polypeptide
uniformly and intimately with liquid carriers or finely divided solid carriers or both.
Then, if necessary, the product is shaped into the desired formulation. Preferably the 
carrier is a parenteral carrier, more preferably a solution that is isotonic with the blood 
of the recipient. Examples of such carrier vehicles include water, saline, Ringer's
solution, and dextrose solution. Non-aqueous vehicles such as fixed oils and ethyl
oleate are also useful herein, as well as liposomes. 

The carrier suitably contains minor amounts of additives such as substances 
that enhance isotonicity and chemical stability. Such materials are non-toxic to 
recipients at the dosages and concentrations employed, and include buffers such as
phosphate, citrate, succinate, acetic acid, and other organic acids or their salts;
antioxidants such as ascorbic acid; low molecular weight (less than about ten residues)
polypeptides, e.g., polyarginine or tripeptides; proteins, such as serum albumin,
gelatin, or immunoglobulins; hydrophilic polymers such as polyvinylpyrrolidone;
amino acids, such as glycine, glutamic acid, aspartic acid, or arginine; monosaccharides,
disaccharides, and other carbohydrates including cellulose or its derivatives, glucose,
mannose, or dextrins; chelating agents such as EDTA; sugar alcohols such as mannitol
or sorbitol; counterions such as sodium; and/or nonionic surfactants such as
polysorbates, poloxamers, or PEG. 



WO 98/31800    PCT/US98/00960



The polypeptide is typically formulated in such vehicles at a concentration of 
about 0.1 mg/ml to 100 mg/ml, preferably 1-10 mg/ml, at a pH of about 3 to 8. It will 
be understood that the use of certain of the foregoing excipients, carriers, or stabilizers 
will result in the formation of polypeptide salts. 

Any polypeptide to be used for therapeutic administration must be sterile.
Sterility is readily accomplished by filtration through sterile filtration membranes (e.g.,
0.2 micron membranes). Therapeutic polypeptide compositions generally are placed 
into a container having a sterile access port, for example, an intravenous solution bag 
or vial having a stopper pierceable by a hypodermic injection needle. 

Polypeptides ordinarily will be stored in unit or multi-dose containers, for
example, sealed ampoules or vials, as an aqueous solution or as a lyophilized
formulation for reconstitution. As an example of a lyophilized formulation, 10-ml
vials are filled with 5 ml of sterile-filtered 1% (w/v) aqueous polypeptide solution,
and the resulting mixture is lyophilized. The infusion solution is prepared by
reconstituting the lyophilized polypeptide using bacteriostatic Water-for-Injection.
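
For illustration, the amount of polypeptide in each vial of the lyophilized example follows from the definition of % (w/v) as grams of solute per 100 ml of solution. The sketch below works through that arithmetic; the function names and the 10 ml reconstitution volume are hypothetical, since the text does not state a reconstitution volume.

```python
def polypeptide_mass_mg(fill_volume_ml, percent_w_v):
    """Mass of solute in a vial, in mg.

    1% (w/v) is 1 g per 100 ml, i.e. 10 mg per ml, so the mass is
    fill volume (ml) x percent x 10 mg/ml.
    """
    if fill_volume_ml < 0 or percent_w_v < 0:
        raise ValueError("inputs must be non-negative")
    return fill_volume_ml * percent_w_v * 10.0


def reconstituted_concentration_mg_per_ml(mass_mg, reconstitution_volume_ml):
    """Concentration after adding the diluent (volume is an assumption)."""
    if reconstitution_volume_ml <= 0:
        raise ValueError("reconstitution_volume_ml must be positive")
    return mass_mg / reconstitution_volume_ml


# The example in the text: 5 ml of a 1% (w/v) solution per vial.
mass = polypeptide_mass_mg(5.0, 1.0)
print(mass)  # 50.0 (mg per vial)

# Hypothetical reconstitution to 10 ml, chosen only for illustration.
print(reconstituted_concentration_mg_per_ml(mass, 10.0))  # 5.0 (mg/ml)
```

Under these assumptions each vial holds 50 mg of polypeptide, and a 10 ml reconstitution would fall inside the 0.1 to 100 mg/ml formulation range given above.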

The invention also provides a pharmaceutical pack or kit comprising one or 
more containers filled with one or more of the ingredients of the pharmaceutical 
compositions of the invention. Associated with such container(s) can be a notice in
the form prescribed by a governmental agency regulating the manufacture, use or sale
of pharmaceuticals or biological products, which notice reflects approval by the
agency of manufacture, use or sale for human administration. In addition, the 
polypeptides of the present invention may be employed in conjunction with other 
therapeutic compounds. 

Having generally described the invention, the same will be more readily
understood by reference to the following examples, which are provided by way of
illustration and are not intended as limiting. 
Examples

EXAMPLE 1. Isolation of a Selected cDNA Clone from the Deposited Sample

Each protein of the invention is related to a human complementary DNA 
(cDNA) clone prepared from a messenger RNA (mRNA) encoding the related protein. 
The cDNA clone related to each protein of the invention is identified by a "cDNA 
Clone ID (Identifier)" in Table 1, below (e.g., "HABCE99"). DNA of each cDNA 
clone in Table 1 is contained in the material deposited with the American Type 
Culture Collection and given the ATCC Deposit Number shown for each cDNA 
Clone ID in Table 1. All deposits containing such clones have been submitted to the
American Type Culture Collection (Rockville, Maryland USA) on the date indicated
for each given accession number indicated in Table 1. All deposits have been made in
accordance with the Budapest Treaty, and in full compliance with 37 CFR §1.801 et 
seq. 

The cDNA clones contained in the ATCC deposits cited in Table 1 can be 
utilized by those of skill in the art by reference to the information describing each
clone, and by reference to SEQ ID NO:X, provided in Table 1 for the determined
nucleotide sequence of each deposited clone. The following additional information is 
provided for convenience. Each cDNA clone in a cited ATCC deposit is contained in 
a plasmid vector. Table 1 identifies the vector used to construct the cDNA library 
from which each clone was isolated. In many cases the vector used to construct the 
library is a phage vector from which a plasmid has been excised. The table 
immediately below provides a correlation of the related plasmid for each such phage 
vector used in construction of the cDNA library from which each cDNA clone listed 
in Table 1 originally was isolated. For example, where a particular clone is identified
in Table 1 as being isolated in the vector "Lambda Zap," it can be seen from the
following table that this cDNA clone is contained in the biological deposit in
pBluescript.
Vector Used to Construct Library    Corresponding Deposited Plasmid
Lambda Zap                          pBluescript (pBS)
Uni-Zap XR                          pBluescript (pBS)
Zap Express                         pBK
lafmid BA                           plafmid BA
pSport1                             pSport1
pCMVSport 2.0                       pCMVSport 2.0
pCMVSport 3.0                       pCMVSport 3.0
pCR®2.1                             pCR®2.1
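
The vector-to-plasmid correlation in the table above amounts to a simple lookup. A minimal sketch follows, assuming the vector names are spelled exactly as in the table ("lafmid BA" is the reading adopted here for that entry); the dictionary and function names are hypothetical and introduced only for illustration.

```python
# Library vector -> plasmid in which the clone is deposited, per the table.
DEPOSITED_PLASMID = {
    "Lambda Zap": "pBluescript (pBS)",
    "Uni-Zap XR": "pBluescript (pBS)",
    "Zap Express": "pBK",
    "lafmid BA": "plafmid BA",
    "pSport1": "pSport1",
    "pCMVSport 2.0": "pCMVSport 2.0",
    "pCMVSport 3.0": "pCMVSport 3.0",
    "pCR®2.1": "pCR®2.1",
}


def deposited_plasmid(library_vector):
    """Return the deposited plasmid for a library vector named in Table 1."""
    try:
        return DEPOSITED_PLASMID[library_vector]
    except KeyError:
        raise KeyError(
            f"no plasmid correlation listed for vector {library_vector!r}"
        ) from None


# A clone listed in Table 1 under "Lambda Zap" is deposited in pBluescript.
print(deposited_plasmid("Lambda Zap"))  # pBluescript (pBS)
```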



Vectors Lambda Zap (U.S. Patent Nos. 5,128,256 and 5,286,636), Uni-Zap
XR (U.S. Patent Nos. 5,128,256 and 5,286,636), Zap Express (U.S. Patent Nos.
5,128,256 and 5,286,636), pBluescript (pBS) (Short, J. M. et al., Nucleic Acids Res.
16:7583-7600 (1988); Alting-Mees, M. A. and Short, J. M., Nucleic Acids Res.
17:9494 (1989)) and pBK (Alting-Mees, M. A. et al., Strategies 5:58-61 (1992)) are
commercially available from Stratagene Cloning Systems, Inc., 11011 N. Torrey Pines
Road, La Jolla, CA, 92037. pBS contains an ampicillin resistance gene and pBK
contains a neomycin resistance gene. Both may be transformed into E. coli strain
XL-1 Blue, also available from Stratagene. pBS comes in 4 forms: SK+, SK-, KS+ and
KS-. The S and K refer to the orientation of the polylinker to the T7 and T3 primer
sequences which flank the polylinker region ("S" is for SacI and "K" is for KpnI, which
are the first restriction enzyme sites on each respective end of the linker). "+" or "-"
refers to the orientation of the f1 origin of replication ("ori"), such that in one
orientation single stranded rescue initiated from the f1 ori generates sense strand DNA
and in the other, antisense.

Vectors pSport1, pCMVSport 2.0 and pCMVSport 3.0 were obtained from
Life Technologies, Inc., P. O. Box 6009, Gaithersburg, MD 20897. All Sport vectors
contain an ampicillin resistance gene and may be transformed into E. coli strain
DH10B, also available from Life Technologies. See, for instance, Gruber, C. E., et al.,
Focus 15:59- (1993). Vector lafmid BA (Bento Soares, Columbia University, NY)
contains an ampicillin resistance gene and can be transformed into E. coli strain XL-1
Blue. Vector pCR®2.1, which is available from Invitrogen, 1600 Faraday Avenue,
Carlsbad, CA 92008, contains an ampicillin resistance gene and may be transformed
into E. coli strain DH10B, available from Life Technologies. See, for instance, Clark,
J. M., Nuc. Acids Res. 16:9677-9686 (1988) and Mead, D. et al., Bio/Technology 9:
(1991).

The deposited material in the sample assigned the ATCC Deposit Number 
cited in Table 1 for any given cDNA clone also may contain one or more additional 
plasmids, each comprising a cDNA clone different from that given clone. Thus, each 
cited deposit contains at least a plasmid for each cDNA clone identified in Table 1 as 
sharing the same ATCC Deposit Number. 

Two approaches are used herein to isolate a particular clone from the 
deposited sample of plasmid DNAs cited for that clone in Table 1, although others are 
known in the art. In the first, a plasmid is isolated directly by screening clones using an
oligonucleotide probe. To isolate a particular clone, a specific oligonucleotide with
30-40 nucleotides is synthesized using an Applied Biosystems DNA synthesizer
according to the sequence reported. The oligonucleotide is labeled, for instance, with
32P-γ-ATP using T4 polynucleotide kinase and purified according to routine methods
(e.g., Maniatis et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor
Press, Cold Spring Harbor, NY, 1982). The plasmid mixture is transformed into a suitable
host, as indicated above (such as XL-1 Blue (Stratagene)) using techniques known to 
those of skill in the art such as those provided by the vector supplier or in related 
publications or patents cited above. The transformants are plated on 1.5% agar plates
(containing the appropriate selection agent, e.g., ampicillin) to a density of about 150 
transformants (colonies) per plate. These plates are screened using Nylon membranes 
according to routine methods for bacterial colony screening (e.g., Sambrook et al., 
Molecular Cloning: A Laboratory Manual, 2nd Edit., (1989), Cold Spring Harbor 
Laboratory Press, pages 1.93 to 1.104), or other techniques known to those of skill in
the art.

An alternative approach to isolate any polynucleotide of interest in the 
deposited library is to prepare two oligonucleotide primers of 17-20 nucleotides 
derived from both ends of the determined sequence for the selected clone (i.e., within 
the region of SEQ ID NO:X bounded by the 5' NT of the clone and the 3' NT of the 
clone defined in Table 1 for each cDNA clone identified therein). These two
oligonucleotide primers are used to amplify the polynucleotide of interest using the 
deposited cDNA plasmid as a template. The polymerase chain reaction is carried out 
under routine conditions, for instance, in 25 µl of reaction mixture with 0.5 µg of the
above cDNA template. A convenient reaction mixture is 1.5-5 mM MgCl2, 0.01%
(w/v) gelatin, 20 µM each of dATP, dCTP, dGTP, dTTP, 25 pmol of each primer and
0.25 Unit of Taq polymerase. Thirty five cycles of PCR (denaturation at 94°C for 1 
min; annealing at 55°C for 1 min; elongation at 72°C for 1 min) are performed with a 
Perkin-Elmer Cetus automated thermal cycler. The amplified product is analyzed by 
agarose gel electrophoresis and the DNA band with expected molecular weight is 
excised and purified. The PCR product is verified to be the selected sequence by 
subcloning and sequencing the DNA product. 
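
The primer selection described above can be sketched in a few lines: the forward primer is read directly from the 5' end of the clone region, and the reverse primer is the reverse complement of the 3' end. This is only an illustrative sketch under stated assumptions; the function names are hypothetical, the positions are taken to be 1-based and inclusive (as the "5' NT" and "3' NT" columns of Table 1 are assumed to be), and the toy sequence is not a sequence from this document.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def end_primers(seq, five_prime_nt, three_prime_nt, length=20):
    """Pick one primer from each end of the clone region.

    five_prime_nt and three_prime_nt are 1-based, inclusive positions
    bounding the clone within the full sequence. The text calls for
    primers of 17-20 nucleotides.
    """
    if not 17 <= length <= 20:
        raise ValueError("primer length should be 17-20 nucleotides")
    region = seq[five_prime_nt - 1:three_prime_nt]
    if len(region) < 2 * length:
        raise ValueError("region too short for two non-overlapping primers")
    forward = region[:length]                       # same strand, 5' end
    reverse = reverse_complement(region[-length:])  # opposite strand, 3' end
    return forward, reverse


# Toy 60-nt sequence used purely to exercise the function.
toy = "ATGC" * 15
fwd, rev = end_primers(toy, 1, 60, length=20)
print(fwd)  # ATGCATGCATGCATGCATGC
print(rev)  # GCATGCATGCATGCATGCAT
```

A practical design would also check melting temperature and self-complementarity, which this sketch deliberately omits.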

Several methods are available for the identification of the 5' or 3' non-coding 
portions of a gene which may not be present in the deposited clone. These methods 
include but are not limited to filter probing, clone enrichment using specific probes and 
protocols similar or identical to 5' and 3' "RACE" protocols which are well known in
the art. For instance, a method similar to 5' RACE is available for generating the
missing 5' end of a desired full-length transcript (Fromont-Racine et al., Nucleic Acids
Res., 21(7):1683-1684 (1993)). Briefly, a specific RNA oligonucleotide is ligated to
the 5' ends of a population of RNA presumably containing full-length gene RNA
transcript, and a primer set containing a primer specific to the ligated RNA
oligonucleotide and a primer specific to a known sequence of the gene of interest is
used to PCR amplify the 5' portion of the desired full-length gene, which may then be
sequenced and used to generate the full length gene. This method starts with total 
RNA isolated from the desired source; poly A RNA may be used but is not a 
prerequisite for this procedure. The RNA preparation may then be treated with 
phosphatase if necessary to eliminate 5' phosphate groups on degraded or damaged 
RNA which may interfere with the later RNA ligase step. The phosphatase, if used,
is then inactivated and the RNA is treated with tobacco acid pyrophosphatase in 
order to remove the cap structure present at the 5' ends of messenger RNAs. This 
reaction leaves a 5' phosphate group at the 5' end of the cap cleaved RNA which can
then be ligated to an RNA oligonucleotide using T4 RNA ligase. This modified RNA 
preparation can then be used as a template for first strand cDNA synthesis using a 
gene specific oligonucleotide. The first strand synthesis reaction can then be used as a
template for PCR amplification of the desired 5' end using a primer specific to the
ligated RNA oligonucleotide and a primer specific to the known sequence of the gene 
of interest. The resultant product is then sequenced and analyzed to confirm that the 
5' end sequence belongs to the desired gene. 

EXAMPLE 2. Features of Proteins of the Invention 

Table 1, below, describes particular features of the proteins and related 
nucleotide and amino acid sequences of this invention. 
FEATURES OF THE PROTEIN ENCODED BY SEQ ID NO: 1

The novel full-length chemotactic cytokine V (CCV) polypeptide exhibits 
significant sequence identity to a chemotactic protein isolated from the murine S100
fraction designated CP-10 (chemotactic protein, 10 kD). The chemotactic cytokine V 
cDNA clone contains a 1091 nucleotide insert (SEQ ID NO:1) which encodes a 103
amino acid polypeptide (SEQ ID NO:2), both shown in Figure 1. The clone was 
obtained from an induced endothelial cell cDNA library. A sequence alignment 
analysis of the deduced amino acid sequence of HEMFI85 shows that CCV shares 
approximately 24% identity and 69% similarity to the amino acid sequence of the 
murine CP-10 protein. In addition, it was determined by a BLAST analysis that the 
amino acid sequence of chemotactic cytokine V also exhibits approximately 31% 
identity and 67% similarity to the previously described rat intracellular Ca2+-binding 
protein. An examination of expression of chemotactic cytokine V in the HGS 
database reveals a widespread cell and tissue distribution of this gene. Expression of 
this clone was observed in a wide variety of human cDNA libraries in the Human 
Genome Sciences, Inc. (HGS) expressed sequence tag (EST) database including colon
carcinoma (HCC) cell line, smooth muscle, amygdala depression, keratinocytes,
uninduced endothelial cells, osteoblasts, and others.
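
Percent identity and percent similarity figures such as those quoted above are computed from a pairwise alignment: identity counts exactly matching residues, while similarity also counts conservative substitutions. The following is a minimal sketch of that calculation. The similarity groups, function names and toy aligned sequences are all assumptions made for illustration; the text does not state which scoring scheme was used, and real tools such as BLAST score similarity with a substitution matrix.

```python
# Hypothetical groups of residues treated as "similar" for this sketch.
SIMILAR_GROUPS = ("ILVM", "FYW", "KRH", "DE", "ST", "NQ", "AG")


def _similar(a, b):
    """True if two residues are identical or share a similarity group."""
    return a == b or any(a in g and b in g for g in SIMILAR_GROUPS)


def identity_and_similarity(aligned_a, aligned_b, gap="-"):
    """Return (% identity, % similarity) over ungapped aligned columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b)
             if a != gap and b != gap]
    if not pairs:
        raise ValueError("no ungapped columns to compare")
    identical = sum(a == b for a, b in pairs)
    similar = sum(_similar(a, b) for a, b in pairs)
    n = len(pairs)
    return 100.0 * identical / n, 100.0 * similar / n


# Toy alignment (not CCV or CP-10): 6 ungapped columns, 3 identical,
# 5 identical-or-similar.
ident, simil = identity_and_similarity("MKLV-DE", "MRIVADQ")
print(round(ident, 1), round(simil, 1))  # 50.0 83.3
```

Note that the denominator matters: here gapped columns are excluded, whereas some programs report percentages over the full alignment length.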

CP-10 is a potent factor capable of extravascular recruitment of 
polymorphonuclear cells (PMN) and monocytes from circulation. Optimal 
chemotactic activity of CP-10 for murine PMN and neutrophils is in the range of
10^-11 to 10^-13 M, making this factor one of the most potent chemotactic factors
reported to date. CP-10 is the murine homologue of a human S100 protein designated
migration inhibition factor-related protein 8 (MRP8). MRP8 can occur as a complex
with an additional human S100 protein termed MRP14 (the complex has previously
been reported as the cystic fibrosis antigen, calgranulin A and B, or L1 antigen). This
complex can comprise as much as 10-20% of the total cytoplasmic protein content of 
resting neutrophils and, although a significantly lower percentage of total cytoplasmic 
protein content, MRP8/14 complexes can also be found in resting monocytes. There 
is also evidence that suggests that MRP8/14 may be released from myeloid cells,
although it is not clear whether the complex is actively released as part of a response 
to inflammation or passively as a part of the demise of such cells during the 
inflammatory process. 

The function(s) of MRP8/14 complexes, CP-10, and related S100 fraction
Ca2+-binding proteins are not entirely clear. However, it is thought that a major 
functional role of such proteins is in the recruitment of certain populations of immune 
cells to areas of inflammation. Devery and coworkers (J. Immunol. 152, 1888-1897; 
1994) have demonstrated that expression of cell surface molecules such as Mac-1, 
which is involved in the process of cell adhesion as well as several additional cellular 
processes, may be influenced by prior interaction of the cell with chemotactic factors 
such as CP-10. These studies have also been performed in vivo where it was observed 
that CP-10 protein accumulated on the endothelial lining of small blood vessels in
LPS-inflamed footpads. Furthermore, increased levels of MRP8/14 have been
observed in the sera of patients afflicted with several inflammatory diseases including 
rheumatoid arthritis. It has also been suggested that chemotactic cytokine molecules 
such as CP-10 or MRP8/14 may function as a type of "calcium sink" during times of 
elevated intracellular levels of calcium for sustained periods of time. Alternatively, it 
has been suggested that MRP8/14 may function as a specific inhibitor of casein kinase 
II activity. Although the precise functional role(s) of many of the currently defined
chemotactic cytokine-like proteins containing significant regions of sequence identity 
to HEMFI85 are not known in any detail, a number of studies with these proteins 
strongly suggest one or more roles for these proteins in a variety of human disease 
states including rheumatoid arthritis, sarcoidosis, tuberculosis, onchocerciasis, and 
other chronic inflammatory disease states. As a result, the discovery of a novel 
chemotactic cytokine-like molecule is believed to be of value in a variety of therapeutic
and diagnostic capacities. 

Owing to the homology to CP-10 and other calcium binding proteins, it is
expected that the CCV polypeptide shares common bioactivities with these proteins. The
activity of CCV may be assayed by any of several biological assays known in the art,
preferably calcium binding assays. The homology to CP-10 and other calcium
binding proteins indicates that the CCV polypeptide is useful in the detection and 
treatment of chronic inflammatory diseases such as rheumatoid arthritis, sarcoidosis, 
tuberculosis and onchocerciasis. 

FEATURES OF THE PROTEINS ENCODED BY SEQ ID NOS: 3 and 5
The full-length nucleotide sequences of two novel human cDNA clones 
(HTXET53 and HT3SG28) which encode splice variants of the previously reported 
and highly related chemokines LAG-2, NKG5, and 519 have recently been identified. 
See, for example, Hercend and Triebel (WPI Acc. No. 90-132241/17). These two
clones have been designated Chemokine from Activated T-Cells-1 (CAT-1) 
(HTXET53), and Chemokine from Activated T-Cells-2 (CAT-2) (HT3SG28). 

The HTXET53 clone was obtained from a human activated (12 hour) T-cell 
cDNA library and contains an 887 nucleotide insert (SEQ ID NO:3) which encodes a
172 amino acid polypeptide (SEQ ID NO:4), shown in Figure 2. The HT3SG28 clone 
was obtained from a human activated (8 hour) T-cell cDNA library and contains a 550 
nucleotide insert (SEQ ID NO:5) which encodes an 88 amino acid polypeptide (SEQ 
ID NO:6), shown in Figure 3. The predicted amino acid sequences of the novel full- 
length CAT splice variants contain several regions of nearly perfect sequence identity 
to the previously reported human LAG-2, NKG5, and 519 lymphokines. Alignment 
of the amino acid sequences shows perfect identity between the two novel molecules 
with LAG-2 and NKG5, with the exception of a 27 amino acid insertion near the 
amino terminus of HTXET53, and a 57 amino acid deletion very near the carboxy 
terminus of HT3SG28. The 519 amino acid sequence differs from each of the novel 
clones and from LAG-2 and NKG5 by an 18 amino acid deletion of the hydrophobic 
amino terminus. 

The HTXET53 polypeptide is predicted to have a 15 amino acid secretory 
leader sequence. The HT3SG28 polypeptide is predicted by the computer program 
PSORT to have either a 15 or a 22 amino acid leader sequence. The leader sequences
are underlined in Figures 2 and 3. Applicants believe that both the shorter and longer
forms of the HT3SG28 polypeptide (i.e., beginning at either residue 16 or residue 23)
are active. 
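
The relationship between leader length and the first residue of the mature form is a simple offset: a leader of n residues yields a mature polypeptide beginning at residue n + 1. A brief sketch follows; the function name and the toy 31-residue string are hypothetical and are not the HT3SG28 sequence.

```python
def mature_form(precursor, leader_length):
    """Return the polypeptide remaining after removing the leader.

    With a leader of n residues, the mature form begins at residue n + 1
    (1-based), which is index n in Python's 0-based slicing.
    """
    if not 0 <= leader_length < len(precursor):
        raise ValueError("leader_length must be within the precursor")
    return precursor[leader_length:]


# Toy precursor: residue 16 is K and residue 23 is R, for easy checking.
toy = "M" + "A" * 14 + "K" + "L" * 6 + "R" + "G" * 8

short_leader = mature_form(toy, 15)  # mature form begins at residue 16
long_leader = mature_form(toy, 22)   # mature form begins at residue 23
print(short_leader[0], len(short_leader))  # K 16
print(long_leader[0], len(long_leader))    # R 9
```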

Expression profiles of the two novel genes are qualitatively identical in the 
HGS database. Additional HGS human cDNA libraries which contain the two novel 
CAT clones are resting T-cells, apoptotic T-cells, activated T-cells, spleen (chronic 
lymphocytic leukemia), activated monocytes, pituitary, and 9 week early stage 
human. The mRNA expression patterns of these novel genes have not been examined 
by Northern blot analysis. 

The original molecule cloned from this group was the T-cell-specific clone 519.
NKG5 was a term used to describe a group of identical clones isolated from a human 
natural killer (NK) cell cDNA library. These genes are highly related and are thought 
to be expressed only in NK and T-cells. A genomic clone of the gene which encodes 
both 519 and NKG5 consists of at least five exons and four introns which are likely 
responsible for the generation of the related, but unique gene products. The genomic 
clone also reveals a number of T-cell-specific and activation state-specific regulatory 
sequences indicating that expression of the gene is highly restricted to certain functions
of a small subset of cell types. 

The novel and previously described molecules discussed herein also contain 
approximately 33% identity with a recently reported clone designated NK-lysin. 
NK-lysin has been found to exhibit a potent anti-bacterial activity against such 
organisms as Escherichia coli, Bacillus megaterium, Acinetobacter calcoaceticus, and 
Streptococcus pyogenes. In addition, NK-lysin was also observed to possess a 
marked lytic activity against an NK-cell-sensitive mouse tumor cell line (YAC-1), but 
had no such activity against erythrocytes. As a result, there are a number of potential 
therapeutic and/or diagnostic applications for a factor such as those encoded by 
HTXET53 and HT3SG28. Applications may include the detection and treatment of 
such clinical presentations as various bacterial infections, a number of lymphomas,
immunological disorders, autoimmune diseases, inflammatory diseases, various
allergies, and possibly as anti-infectious agents.

FEATURES OF THE PROTEINS ENCODED BY SEQ ID NOS: 7 and 9 

The novel Melanoma Inhibitory Activity Protein (MIA)-2 and -3 cDNA clones
presented herein are shown in Figures 4 and 5. The cDNA clone HBZAK03 contains
a 520 nucleotide insert (SEQ ID NO:7) which encodes a 59 amino acid polypeptide
(SEQ ID NO:8), as shown in Figure 4. A BLAST analysis of the predicted amino acid
sequence of HBZAK03 demonstrates that this novel clone appears to be a splice
variant of another cDNA clone designated HLFBD44. The nucleotide sequence of
HLFBD44 (SEQ ID NO:9) and deduced amino acid sequence (SEQ ID NO:10) are
shown in Figure 5. Both of these HGS clones exhibit significant sequence identity to a
human gene termed melanoma inhibitory activity (MIA) protein. BestFit analysis
demonstrates that the HBZAK03 protein exhibits approximately 20% identity and
58% similarity to the MIA protein over a region of roughly 60 amino acids. The
expression profile of the HBZAK03 cDNA in the HGS database reveals that it
appears in a number of HGS human cDNA libraries in addition to the prostate cDNA
library from which it was cloned. Some of the cDNA libraries in which this clone
appears include fetal lung, the bone marrow cell line (RS4;11), macrophage,
serum-treated smooth muscle, epileptic frontal cortex, subtracted fetal brain, HSA 172 cell
line, induced endothelial cells, and others.

The highest sequence identity of the novel cDNA clones presented herein 
suggests that they may possess a function involved in the regulation of melanoma 
progression. The previously described MIA protein functions as a component of a
highly complex and only partially characterized system of stimulatory and inhibitory
factors which together dictate the progression of a melanoma. MIA is secreted by 
malignant melanoma cells and has the capacity to inhibit the growth of melanoma cells 
in culture. Investigators have examined the expression profile of the MIA gene by 
Northern blot and RT-PCR analysis and have determined that it is expressed in all 
melanoma cell lines, a few glioma cell lines, approximately half of the benign
melanomas, all malignant melanomas, and all lymph node metastases of
malignant melanomas examined (Bosserhoff et al., J. Biol. Chem. 271, 490-495; 1996).
In contrast, no MIA expression was detected by these methods in samples obtained 
from any other skin-derived cells including normal fibroblasts, HaCaT keratinocytes, 
COS cells, HeLa cells, HepG2 cells, DU 145 (human prostate carcinoma) cells, and 
J82 (human bladder carcinoma) cells. 

Based on the sequence similarity between these polypeptides, MIA-2 and -3 are
predicted to be useful in the detection and regulation of malignant melanoma, in
immune system modulation, and in the treatment of cardiac arrest and stroke. Other
activities of MIA-1 as well as assays for detecting MIA-1 activity are outlined in WO
95/03328, hereby incorporated herein by reference in its entirety. MIA-2 and -3 
activity can be assayed accordingly. 

FEATURES OF THE PROTEINS ENCODED BY SEQ ID NOS: 11 and 13

A macrophage-specific protein, termed AIF-1, has only very recently been 
molecularly cloned. AIF-1 appears to function in macrophage activation in the 
pathogenesis of chronic cardiac rejection following transplantation. A characteristic 
manifestation of cardiac tissue rejection following transplantation is an immune- 
mediated arteriosclerosis which ultimately results in graft failure and creates the need 
for retransplantation during the first postoperative year. It is thought that the 
arteriosclerotic state results from an alloimmune response involving activated immune
cells, particularly macrophages, which stimulate smooth muscle-cell migration and 
proliferation into the area of the transplant leading to lesions in donor vessels. AIF-1 
was identified by Utans and coworkers (J. Clin. Invest. 95, 2954-2962; 1995) in 
ongoing studies of inducible gene expression patterns in macrophage cells in a chronic 
rejecting rat heart allograft model. AIF-1 was expressed in response to IFN-γ in the
chronic cardiac rejection model referenced above. Expression of AIF-1 was seen
selectively in activated macrophages, neutrophils, and the macrophage-like cell lines
THP-1, U937, and HL60, but not in several other human cells and tissues examined.
Furthermore, low levels of AIF-1 expression can be observed in endomyocardial 
biopsy samples obtained from human heart transplant patients. 

The cDNA clone designated HEBGM49 or "AIF-2" contains a 632 nucleotide 
cDNA insert (SEQ ID NO: 11) encoding a 150 amino acid polypeptide (SEQ ID 
NO: 12), as shown in Figure 6. The cDNA clone was isolated from a human early 
stage brain cDNA library. This clone also appears in several other cDNA libraries 
constructed from a variety of human cell and tissue types including fetal epithelium, 
fetal kidney, hippocampus, tongue, and osteoblastoma HOS cells. A BLAST analysis 
of the amino acid sequence of HEBGM49 demonstrated that this clone exhibits 
approximately 65% identity and 80% similarity with AIF-1 over its entire length. 

The cDNA clone HNGBH45 or "AIF-3" contains a 757 nucleotide cDNA insert
(SEQ ID NO: 13) encoding a 193 amino acid polypeptide (SEQ ID NO: 14), as shown 
in Figure 7. The cDNA clone was isolated from a human neutrophil cDNA library. 
This clone appears in a number of additional cDNA libraries including aortic 
endothelium, cerebellum, corpus callosum, CD34-depleted buffy coat, activated
neutrophil, colon cancer, resting T-cells, tonsils, and others. A BLAST analysis of the 
amino acid sequence of HNGBH45 demonstrated that this clone exhibits 
approximately 25% identity and 47% similarity over approximately 70 amino acids of 
the AIF-1 molecule. 
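
The percent identity and percent similarity figures quoted above summarize a pairwise alignment: identity counts aligned positions carrying the same residue, while similarity also counts conservative substitutions. The following is a minimal sketch of that arithmetic and is not the BLAST algorithm itself. It assumes the two sequences have already been aligned (with '-' marking gaps), and the function name, the toy sequences, and the conservative-substitution groups are all hypothetical choices made for illustration; an actual BLAST report derives its similarity ("positives") count from a substitution matrix.

```python
# Hypothetical illustration only: percent identity and percent similarity
# for a pair of ALREADY ALIGNED amino acid sequences ('-' marks a gap).
# SIMILAR_GROUPS is an assumed, simplified stand-in for the positive-scoring
# residue pairs of a substitution matrix; it is not taken from the source.
SIMILAR_GROUPS = ["ILVM", "FYW", "KRH", "DE", "NQ", "ST", "AG"]


def are_similar(a, b):
    """Return True if residues a and b fall in the same assumed group."""
    return any(a in group and b in group for group in SIMILAR_GROUPS)


def identity_and_similarity(aligned_a, aligned_b):
    """Return (percent identity, percent similarity) over the alignment.

    Columns containing a gap in one sequence count toward the alignment
    length but never toward identity or similarity. Columns that are gaps
    in both sequences are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    columns = identical = similar = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue  # not a real alignment column
        columns += 1
        if a == "-" or b == "-":
            continue  # gapped column: counted in length only
        if a == b:
            identical += 1
            similar += 1  # identical residues also count as similar
        elif are_similar(a, b):
            similar += 1

    if columns == 0:
        raise ValueError("alignment contains no residues")
    return 100.0 * identical / columns, 100.0 * similar / columns


# Toy 8-column alignment (invented sequences, not AIF-1/AIF-2/AIF-3).
identity, similarity = identity_and_similarity("MKTLLD-A", "MRTILEGA")
print(f"{identity:.1f}% identity, {similarity:.1f}% similarity")
# prints "50.0% identity, 87.5% similarity"
```

Whether gapped columns are included in the denominator is a convention that differs between tools, so figures computed this way would be expected to track, but not necessarily reproduce, the values reported by a given alignment program.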

AIF-2 and AIF-3 are believed to be valuable clinical markers for
assessing varying degrees of acute and chronic rejection of transplanted cardiac tissue.
In addition, monitoring the level of AIF-2 and/or AIF-3 expression may also be useful
in determining the level of macrophage or neutrophil infiltration into the area of the
transplanted tissue. In addition, AIF-2 and -3 may be used as targets in assays for the
identification of antagonists such as small organic molecules which act to block AIF
activity. Such assays are known in the art. 

FEATURES OF PROTEIN ENCODED BY SEQ ID NO: 15

The full-length nucleotide sequence of a novel human cDNA clone 
(HSAAL25) has been isolated which is believed to encode a new member of the 
annexin/lipocortin supergene family. The novel polypeptide is termed herein 
"Annexin HSAAL25". The annexin/lipocortin supergene family is composed of at 
least ten calcium-binding proteins proposed to function in a variety of cellular roles 
including phospholipase A2 and protein kinase C inhibition, anti-coagulation, endo- 
and exo-cytosis, inositol phosphate metabolism, and as calcium channel proteins. 
Eukaryotic calcium-binding proteins are typically classified as proteins which bind 
calcium by a mechanism which either includes or does not include an E-F hand motif. 
The annexin/lipocortin superfamily is the largest group of calcium-binding proteins 
whose interaction with calcium is not mediated by an E-F hand motif. Structurally, all 
known annexins may be characterized by a common carboxy terminal region consisting 
of four similar amino acid sequences, of approximately seventy amino acids each, 
termed the "annexin repeats". Conversely, the amino termini of annexin/lipocortin 
proteins vary widely in both length and amino acid composition between member 
protein sequences. Typical expression patterns of annexin/lipocortin proteins include 
a wide variety of cells and tissues including lung, kidney, bone marrow, spleen, 
thymus, brain, macrophage, placenta, ovary, uterus, skeletal muscle, and others. 

Annexin/lipocortin proteins are involved in a wide variety of physiologically 
important cellular processes. For example, lipocortin-1 (LC-1; also known as annexin-I)
appears to function as a second messenger in the anti-inflammatory glucocorticoid
signal transduction cascade. Most LC-1 molecules are cell surface-associated and 
attached to the plasma membrane by a Ca2+-dependent interaction with unrelated 
plasma membrane binding molecules. The process of extravasation, in which 
polymorphonuclear leukocytes (PMNs) migrate into an area of inflammation, adhere 
to the vascular wall, and eventually pass through the vascular wall into the 
surrounding tissue, may be delayed by glucocorticoids, and, as a result of LC-1
function, the overall process of inflammation may be delayed. As an example of the
functional diversity of LC-1 and other annexin/lipocortin superfamily members, LC-1
has also been shown to play a major regulatory role in a number of possibly unrelated
cellular systems such as cell growth regulation and differentiation, response of the 
CNS to cytokines, neuroendocrine secretion, anti-coagulation, and neurodegeneration. 

Annexin HSAAL25 contains a 1356 nucleotide cDNA insert (SEQ ID NO: 15)
encoding a 324 amino acid polypeptide (SEQ ID NO: 16), as is shown in Figure 8. 
HSAAL25 was isolated from a cDNA library made from the HSA 172 cell line. 
Although previously described annexin/lipocortin proteins are widely expressed, this
clone appears only once in the HSA 172 cell line cDNA library and does not
appear in any other tissue type assayed. A BLAST analysis of the amino acid
sequence of HSAAL25 demonstrated that this clone exhibits at least 30% identity and
55% similarity over the entire length of a molecule designated human annexin-III, a
member of the annexin/lipocortin supergene family. 

There is clearly a need for identifying and exploiting novel members of the 
annexin/lipocortin superfamily such as the cDNA clone described herein. Plasma 
membrane-associated molecules, such as the novel potential members of the 
annexin/lipocortin superfamily detailed here, should prove useful in target based 
screens for small molecules and other such pharmacologically valuable factors that 
may be useful for regulating the complex processes of inflammation. Furthermore, 
Annexin HSAAL25 is believed to be useful as a regulator of coagulation
(anti-coagulant) by affecting Ca2+-dependent cell to cell aggregation. In addition, this
annexin-like clone may prove valuable in a number of other therapeutically useful roles 
as an anti-inflammatory agent including regulation of ischemia, tumor metastasis, 
rheumatoid arthritis, other inflammatory diseases, wound healing, arteriosclerosis, and 
other heart diseases. 



FEATURES OF PROTEIN ENCODED BY SEQ ID NO: 17 

The full-length nucleotide sequence of a novel human cDNA (HUSAX55) which
encodes a previously unidentified "ES/130-like I" protein has been identified. The
translation product of the novel full-length ES/130-like I cDNA clone exhibits
significant sequence identity to the chicken EDTA-soluble/130 kDa protein (ES/130)
gene. The ES/130-like I cDNA clone contains a 3036 nucleotide insert (SEQ ID
NO: 17) which encodes a 977 amino acid polypeptide (SEQ ID NO: 18), as shown in
Figure 9. The clone was obtained from an umbilical vein endothelial cell cDNA
library. A BLAST analysis of the deduced amino acid sequence of HUSAX55
demonstrates approximately 66% identity and 83% similarity to the amino acid sequence of
the chicken ES/130 gene over a 573 amino acid stretch. Expression of ES/130-like I is
detected in a wide collection of HGS human cDNA libraries including amygdala 
depression, thymus, smooth muscle, endometrial tumor, synovial sarcoma, 
macrophage, fetal heart, and a number of others. Northern blot analyses of
expression of the ES/130-like I gene indicate a high level of expression in pancreas
and liver and moderate to low expression elsewhere. 

The in vitro process of endothelial cell transformation to mesenchymal tissue 
models a similar in vivo process in the developing heart where closely associated 
epithelial cells undergo a transformation to cardiac mesenchyme tissue. This 
transformation is a required event for the development of a multichambered heart from 
the primitive, single chambered heart tube. ES/130 was originally identified as a 130
kD antigen isolated from the 100,000 x g pellet fraction of non-cytolytic EDTA 
extracts of developing chicken cardiac tissue. Inclusion of this fraction in cardiac 
endothelial cell cultures results in formation of mesenchymal tissue. ES/130 is an 
extracellular, secreted protein which, in addition to endothelial cell transformation, has 
been proposed to function in the regulation of adhesion molecule expression and limb 
bud ectoderm, neural tube, and notochord development. Potential therapeutic and/or
diagnostic applications for the ES/130-like I protein include such clinical presentations
as atherosclerosis, restenosis, or as a general factor following a number of types of 
surgery. 

FEATURES OF THE PROTEIN ENCODED BY SEQ ID NO: 19

The full-length nucleotide sequence of a human cDNA clone (HSXCK41) which
encodes a novel brain-enriched hyaluronan-binding factor ("BEF") has been
determined. The novel BEF cDNA clone presented herein was discovered in a human
substantia nigra cDNA library. The clone contains a 1757 nucleotide insert (SEQ ID
NO: 19) which is predicted to encode a 528 amino acid polypeptide (SEQ ID NO:20).
A BLAST analysis of the predicted amino acid sequence of HSXCK41 demonstrates
significant sequence identity to the bovine brevican mRNA (GenBank entry X75887),
a member of the aggrecan/versican family of cell surface proteoglycans. The
HSXCK41 amino acid sequence exhibits approximately 92% identity and 95%
similarity over an approximately 400 amino acid stretch of the brevican sequence.
This clone has been identified in a number of additional HGS human cDNA libraries,
many of which originate from neural tissues. These include epileptic frontal cortex,
early stage brain, skin tumor, hippocampus, cerebellum, hemangiopericytoma, infant
brain, fetal brain, and fetal bone.

The aggrecan/versican family of cell surface proteoglycans may be characterized
by the presence of chondroitin sulfate side chains, a hyaluronic acid (HA)-binding
motif in the amino terminal domain, and at least one epidermal growth factor (EGF)-like
repeat, a lectin-like motif, and one or more complement regulatory protein (CRP)-like
motifs in the carboxy terminal domain. The aggrecan/versican family includes a
number of members such as brevican, aggrecan, decorin, versican, and neurocan.
Brevican is expressed predominantly in the brain and in primary cerebellar astrocytes,
but not in neurons. Meanwhile, both aggrecan and versican are expressed in
chondrocytes in human articular cartilage obtained from subjects of a wide range of
ages. Aggrecan messenger RNAs undergo alternative splicing events which vary the
inclusion or exclusion of the single EGF-like motif in the carboxy terminal domain.
Alternatively, versican contains two EGF-like motifs and a single CRP-like motif, all
of which are present in all expression patterns examined. Finally, the expression of
two recently described members of the aggrecan/versican family isolated from the
human sciatic nerve is significantly increased following lesioning of the nerve.

The functional roles of members of the aggrecan/versican family are rather varied.
Aggrecan itself aggregates with HA to function as a major space-filling component of
cartilage. Brevican, an aggrecan/versican family member which is a conditional
chondroitin sulfate proteoglycan, appears in a secreted, soluble form as well as in a
GPI-anchored form. Both brevican isoforms have been implicated as functional
components of the terminally differentiating and adult nervous systems. It will likely
be determined that molecules such as these and the novel BEF cDNA clone
HSXCK41 may play a role in one or more of a variety of cellular processes which
typically involve intercellular contact and communication mediated through cell
surface and/or secreted glycoprotein factors. Such cellular processes might include cell
adhesion, proliferation, tumor metastasis, and lymphocyte migration into areas of
inflammation. Related polypeptides are believed to be expressed at a higher level in
tumors such as gliomas. Thus, BEF polynucleotides and polypeptides are useful as
diagnostic markers and reagents for detection of tumors such as gliomas.

FEATURES OF THE PROTEIN ENCODED BY SEQ ID NO: 21 

The full-length nucleotide sequence of a human cDNA clone (HFKFY79) which
encodes a novel adipose differentiation factor ("ADF") has recently been determined.
The novel ADF cDNA clone presented herein was originally isolated from a human
fetal kidney cDNA library. The clone contains a 1550 nucleotide insert (SEQ ID
NO:21) which encodes a 452 amino acid polypeptide (SEQ ID NO:22), as shown in
Figure 11. A BLAST analysis of the predicted amino acid sequence of HFKFY79
demonstrates that this clone exhibits its highest degree of sequence relatedness in the
GenBank public database to the murine ADF protein (GenBank accession number
M93275). Based on its homology to murine ADF, human ADF is believed to share
common biological activities. A BestFit analysis of the predicted amino acid sequence
of HFKFY79 versus the murine ADF amino acid sequence demonstrates that the two
protein sequences exhibit approximately 39% identity and 79% similarity. The
expression profile of the HFKFY79 clone suggests a widely distributed expression 
pattern. In addition to the human fetal kidney library from which this clone was 
obtained, it also appears in a large number of human cDNA libraries including 
ulcerative colitis, adult testis, hypothalamus, induced endothelial cells, Jurkat T-cell 
line in S-phase, serum-treated and control smooth muscle, adipocytes, adult small 
intestine, lymph node breast cancer, infant brain, and many others. 

The murine ADF gene was cloned by Jiang & Serrero (Proc. Natl. Acad. Sci. USA 
89, 7856-7860; 1992, incorporated herein by reference) in an effort to identify genes 
whose expression profiles change significantly during the process of 1246 adipocyte 
cell and primary adipocyte differentiation. The murine ADF gene product identified 
by Jiang & Serrero is a 50 kD, membrane-bound protein expressed abundantly in 
mouse fat pads. The novel cDNA presented herein also exhibits sequence identity to 
several additional lipid-specific proteins. The first of the putative homologues is the 
major substrate for cAMP-dependent protein kinase A (PKA) in adipocytes and is 
termed perilipin. Perilipin is expressed in two alternatively spliced forms designated 
perilipins A and B. Both forms of perilipins are expressed exclusively at the surface 
of lipid storage droplets. It is thought that perilipins may function as a barrier to
deny access of lipase to the lipid reservoir of unstimulated cells. This event may be
regulated by PKA-dependent phosphorylation of perilipin which allows exposure of 
lipid molecules to lipase. In addition, ADF is also related by sequence identity to a 
gene cloned from a human bone marrow-derived stromal cell line (KM-102) designated 
adipogenesis inhibitory factor (AGIF). AGIF has been shown to inhibit the process 
of adipogenesis in the mouse preadipocyte cell line 3T3-L1. Thus, human ADF may
be useful among other things as a therapeutic modulator of lipid metabolism in the 
human body. 

FEATURES OF THE PROTEIN ENCODED BY SEQ ID NO: 23

The novel "Bcl-like" cDNA clone (HAICH28) presented herein was originally
identified in a TNF-α/IFN-induced endothelial cell cDNA library. The clone contains
a 1211 nucleotide insert (SEQ ID NO:23) which encodes a 365 amino acid 
polypeptide (SEQ ID NO:24). A BLAST analysis of the predicted amino acid 
sequence of HAICH28 demonstrates that this clone exhibits strong sequence 
similarity to two previously reported genes termed bovine polyA binding protein II 
and human Bcl-w (GenBank accession numbers X89969 and U59747, respectively). 
The expression profile of the HAICH28 clone suggests a widely distributed 
expression pattern. In addition to the TNF-α/IFN-induced endothelial cell library
from which this clone was obtained, it also appears in a large number of human cDNA 
libraries including PHA-stimulated T-cells, osteoblasts, schizophrenic hypothalamus, 
activated monocytes, adrenal gland tumor, primary dendritic cells, and a number of 
others. 

The protein product of the related Bcl-w gene has been determined to function as 
a key player in the cellular apoptosis or cell death pathway. Apoptosis is a term 
which describes the process of programmed cell death in vertebrates. During the 
process of apoptosis, the cell membrane shrinks and blebs resulting in a loss of 
membrane integrity and intercellular contact. In addition, the chromatin is condensed 
and cleaved into a characteristic ladder-like organization and, finally, vesicular 
remnants of the cell are quickly engulfed and destroyed by neighboring cells. The 
signal for the cell to enter the apoptotic pathway likely begins with the binding of Fas
ligand, tumor necrosis factor (TNF), or the recently discovered TRAIL ligand to the
Fas/CD95/APO-1, TNF (p55), or DR4 or DR5 receptors, respectively. These
ligand/receptor interactions recruit a cellular protein designated FLICE to the cell
membrane to act as a physical link between the Fas/CD95/APO-1 and TNF receptor
complexes, also termed death receptors, and the cysteine proteases belonging to the
interleukin-1β (IL-1β) converting enzyme (ICE)/CED-3 family to carry out the
process of apoptosis. 

The t(14;18) chromosomal translocation is often associated with human follicular
lymphoma. In this chromosomal abnormality, the immunoglobulin heavy chain locus 
becomes translocated adjacent to the Bcl-2 gene, resulting in a drastic overexpression 
of the Bcl-2 gene. Bcl-2 blocks the process of apoptosis by an unknown mechanism. 
It has been proposed that Bcl-2 controls the process of apoptosis by regulating 
endoplasmic reticulum-associated Ca2+ fluxes. Several other genes have been
identified which have significant regions of sequence identity with Bcl-2, including 
Ced-9, BHRF1, Bax, Bcl-xS, Bcl-xL, Bcl-w, Bak, Mcl-1, and GRS. The protein 
product of each of these genes can affect the process of apoptosis in either a positive 
(for example, Bax or Bcl-xS) or negative (for example, Bcl-2, BHRF1, Ced-9, or
Bcl-xL) fashion.

A large number of cells fall victim to the apoptotic process throughout 
development and during the lifetime of the organism. Clearly, strict regulation of the 
functional molecules comprising such a potentially dangerous process is an extremely 
necessary and valuable facet of the repertoire of cellular regulatory pathways. As a 
result, the identification of novel molecules related to Bcl-2 or Bcl-w, such as that 
encoded by the novel cDNA clone described herein, represents a major step in 
understanding, and, in turn, exploiting the complex process of controlled cell death. 
Accordingly, the Bcl-like polypeptide of the present invention is thought to be useful
as a therapeutic in an anti-viral or anti-tumor capacity or, alternatively, in a diagnostic 
capacity. 

INDICATIONS RELATING TO A DEPOSITED MICROORGANISM

(PCT Rule 13bis)

A. The indications made below relate to the microorganism referred to in the description
on page 53, line N/A

B. IDENTIFICATION OF DEPOSIT Further deposits are identified on an additional sheet

Name of depositary institution

American Type Culture Collection

Address of depositary institution (including postal code and country)

12301 Parklawn Drive
Rockville, Maryland 20852
United States of America

Date of deposit

May 16, 1997

Accession Number

209053

C. ADDITIONAL INDICATIONS (leave blank if not applicable) This information is continued on an additional sheet

D. DESIGNATED STATES FOR WHICH INDICATIONS ARE MADE (if the indications are not for all designated States)

E. SEPARATE FURNISHING OF INDICATIONS (leave blank if not applicable)

The indications listed below will be submitted to the International Bureau later (specify the general nature of the indications, e.g., "Accession Number of Deposit")

For receiving Office use only

This sheet was received with the international application

Authorized officer

For International Bureau use only

This sheet was received by the International Bureau on:

Authorized officer

INDICATIONS RELATING TO A DEPOSITED MICROORGANISM

(PCT Rule 13bis)

A. The indications made below relate to the microorganism referred to in the description
on page 53, line N/A

B. IDENTIFICATION OF DEPOSIT Further deposits are identified on an additional sheet

Name of depositary institution

American Type Culture Collection

Address of depositary institution (including postal code and country)

12301 Parklawn Drive
Rockville, Maryland 20852
United States of America

Date of deposit

May 16, 1997

Accession Number

209054

C. ADDITIONAL INDICATIONS (leave blank if not applicable) This information is continued on an additional sheet

D. DESIGNATED STATES FOR WHICH INDICATIONS ARE MADE (if the indications are not for all designated States)

E. SEPARATE FURNISHING OF INDICATIONS (leave blank if not applicable)

The indications listed below will be submitted to the International Bureau later (specify the general nature of the indications, e.g., "Accession Number of Deposit")

For receiving Office use only

This sheet was received with the international application

Authorized officer

For International Bureau use only

This sheet was received by the International Bureau on:

Authorized officer

What Is Claimed Is:

1. An isolated nucleic acid molecule comprising a nucleotide sequence
which is at least 95% identical to a nucleotide sequence encoding a polypeptide
selected from the group consisting of:

(a) the polypeptide shown in SEQ ID NO:2;
(b) the polypeptide shown in SEQ ID NO:4;
(c) the mature polypeptide shown as residues 16-172 in SEQ ID NO:4;
(d) the polypeptide shown in SEQ ID NO:6;
(e) the mature polypeptide shown as residues 16-88 in SEQ ID NO:6;
(f) the mature polypeptide shown as residues 23-88 in SEQ ID NO:6;
(g) the polypeptide shown in SEQ ID NO:8;
(h) the polypeptide shown in SEQ ID NO:10;
(i) the polypeptide shown in SEQ ID NO:12;
(j) the polypeptide shown in SEQ ID NO:14;
(k) the polypeptide shown in SEQ ID NO:16;
(l) the polypeptide shown in SEQ ID NO:18;
(m) the polypeptide shown in SEQ ID NO:20;
(n) the mature polypeptide shown as residues 16-528 in SEQ ID NO:20;
(o) the polypeptide shown in SEQ ID NO:22; and
(p) the polypeptide shown in SEQ ID NO:24.



2. The nucleic acid molecule of claim 1 comprising a nucleotide sequence
which is at least 95% identical to a nucleotide sequence selected from the group
consisting of: SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID
NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID
NO:19, SEQ ID NO:21, and SEQ ID NO:23.

3. An isolated nucleic acid molecule of claim 3 comprising a nucleotide
sequence which is at least 95% identical to a sequence of at least about 500 contiguous 
nucleotides in the nucleotide sequence of SEQ ID NO:X. 

4. An isolated nucleic acid molecule which hybridizes under stringent 
hybridization conditions to a nucleic acid molecule of claim 1, wherein said nucleic 
acid molecule which hybridizes does not hybridize under stringent hybridization 
conditions to a nucleic acid molecule having a nucleotide sequence consisting of only A 
residues or of only T residues. 

5. An isolated nucleic acid molecule of claim 6 comprising a nucleotide
sequence which is at least 95% identical to a sequence of at least 500 contiguous
nucleotides in the nucleotide sequence encoded by said human cDNA clone.

6. An isolated polypeptide comprising an amino acid sequence which is 
identical to a sequence of at least about 10 contiguous amino acids in the amino acid 
sequence of SEQ ID NO:Y, wherein Y is any integer as defined in Table 1.

7. An isolated polypeptide of claim 6 comprising an amino acid sequence at 
least 95% identical to the complete amino acid sequence of SEQ ID NO:Y. 

8. An isolated polypeptide comprising an amino acid sequence identical to a 
sequence of at least about 10 contiguous amino acids in the complete amino acid sequence 
of a secreted protein encoded by a human cDNA clone identified by a cDNA Clone 
Identifier in Table 1 and contained in the deposit with the ATCC Deposit Number 
shown for said cDNA clone in Table 1.

9. A method of making a recombinant vector comprising inserting an isolated 
nucleic acid molecule of claim 1 into a vector. 

10. A recombinant vector produced by the method of claim 9.

11. A method of making a recombinant host cell comprising introducing a 
vector of claim 10 into a host cell.

12. A recombinant host cell produced by the method of claim 11.

13. A method of making an isolated polypeptide comprising culturing a 
recombinant host cell of claim 12 under conditions such that said polypeptide is 
expressed and recovering said polypeptide. 

14. An isolated polypeptide produced by the method of claim 13. 

15. An isolated antibody capable of specifically binding to a polypeptide of
claim 6.